Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)
Hi,
The commit b437571 <http://b437571714707bc6466abde1a0af5e69aaade09c> I
think has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_worker
The behavior is the bs_spool (heap and index fields)
are left empty.
The code affected is:
buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = buildstate->bs_spool->heap;
- buildstate->bs_spool->index = buildstate->bs_spool->index;
+ buildstate->bs_spool->heap = heap;
+ buildstate->bs_spool->index = index;
Is the fix correct?
best regards,
Ranier Vilela
Attachments:
001-fix-brin-private-spool-initialization.patchapplication/octet-stream; name=001-fix-brin-private-spool-initialization.patchDownload
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 23f081389b..40ee24591c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2706,8 +2706,8 @@ _brin_leader_participate_as_worker(BrinBuildState *buildstate, Relation heap, Re
/* Allocate memory and initialize private spool */
buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = buildstate->bs_spool->heap;
- buildstate->bs_spool->index = buildstate->bs_spool->index;
+ buildstate->bs_spool->heap = heap;
+ buildstate->bs_spool->index = index;
/*
* Might as well use reliable figure when doling out maintenance_work_memOn 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571 <http://b437571714707bc6466abde1a0af5e69aaade09c> I
think has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_workerThe behavior is the bs_spool (heap and index fields)
are left empty.The code affected is: buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool)); - buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this. Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.
I wonder how come this didn't fail during testing. Surely, if the leader
participates as a worker, the tuplesort_begin_index_brin shall be called
with heap/index being NULL, leading to some failure during the sort. But
maybe this means we don't actually need the heap/index fields, it's just
a copy of TuplesortIndexArg, but BRIN does not need that because we sort
the tuples by blkno, and we don't need the descriptors for that.
In any case, the _brin_parallel_scan_and_build does not actually need
the separate heap/index arguments, those are already in the spool.
I'll try to figure out if we want to simplify the tuplesort or remove
the arguments from _brin_parallel_scan_and_build.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:
On 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571 <http://b437571714707bc6466abde1a0af5e69aaade09c> I
think has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_workerThe behavior is the bs_spool (heap and index fields)
are left empty.The code affected is: buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool)); - buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if the leader
participates as a worker, the tuplesort_begin_index_brin shall be called
with heap/index being NULL, leading to some failure during the sort. But
maybe this means we don't actually need the heap/index fields, it's just
a copy of TuplesortIndexArg, but BRIN does not need that because we sort
the tuples by blkno, and we don't need the descriptors for that.
Unfortunately I can't test on Windows, since I can't build with meson on
Windows.
In any case, the _brin_parallel_scan_and_build does not actually need
the separate heap/index arguments, those are already in the spool.
Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or remove
the arguments from _brin_parallel_scan_and_build.
Thank you for your work.
best regards,
Ranier Vilela
On 12/27/23 12:37, Ranier Vilela wrote:
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571
<http://b437571714707bc6466abde1a0af5e69aaade09c
<http://b437571714707bc6466abde1a0af5e69aaade09c>> Ithink has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_workerThe behavior is the bs_spool (heap and index fields)
are left empty.The code affected is: buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool)); - buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if the leader
participates as a worker, the tuplesort_begin_index_brin shall be called
with heap/index being NULL, leading to some failure during the sort. But
maybe this means we don't actually need the heap/index fields, it's just
a copy of TuplesortIndexArg, but BRIN does not need that because we sort
the tuples by blkno, and we don't need the descriptors for that.Unfortunately I can't test on Windows, since I can't build with meson on
Windows.In any case, the _brin_parallel_scan_and_build does not actually need
the separate heap/index arguments, those are already in the spool.Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or remove
the arguments from _brin_parallel_scan_and_build.
Here is a patch simplifying the BRIN parallel create code a little bit.
As I suspected, we don't need the heap/index in the spool at all, and we
don't need to pass it to tuplesort_begin_index_brin either - we only
need blkno, and we have that in the datum1 field. This also means we
don't need TuplesortIndexBrinArg.
I'll push this tomorrow, probably.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
brin-parallel-create-simplify.patchtext/x-patch; charset=UTF-8; name=brin-parallel-create-simplify.patchDownload
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 23f081389b2..a58f662f2cf 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -56,8 +56,6 @@
typedef struct BrinSpool
{
Tuplesortstate *sortstate; /* state data for tuplesort.c */
- Relation heap;
- Relation index;
} BrinSpool;
/*
@@ -1144,8 +1142,6 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
RelationGetNumberOfBlocks(heap));
state->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- state->bs_spool->heap = heap;
- state->bs_spool->index = index;
/*
* Attempt to launch parallel worker scan when required
@@ -1200,8 +1196,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* factor.
*/
state->bs_spool->sortstate =
- tuplesort_begin_index_brin(heap, index,
- maintenance_work_mem, coordinate,
+ tuplesort_begin_index_brin(maintenance_work_mem, coordinate,
TUPLESORT_NONE);
/*
@@ -2706,8 +2701,6 @@ _brin_leader_participate_as_worker(BrinBuildState *buildstate, Relation heap, Re
/* Allocate memory and initialize private spool */
buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = buildstate->bs_spool->heap;
- buildstate->bs_spool->index = buildstate->bs_spool->index;
/*
* Might as well use reliable figure when doling out maintenance_work_mem
@@ -2736,8 +2729,8 @@ _brin_leader_participate_as_worker(BrinBuildState *buildstate, Relation heap, Re
static void
_brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
BrinShared *brinshared, Sharedsort *sharedsort,
- Relation heap, Relation index, int sortmem,
- bool progress)
+ Relation heap, Relation index,
+ int sortmem, bool progress)
{
SortCoordinate coordinate;
TableScanDesc scan;
@@ -2751,9 +2744,7 @@ _brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
coordinate->sharedsort = sharedsort;
/* Begin "partial" tuplesort */
- brinspool->sortstate = tuplesort_begin_index_brin(brinspool->heap,
- brinspool->index,
- sortmem, coordinate,
+ brinspool->sortstate = tuplesort_begin_index_brin(sortmem, coordinate,
TUPLESORT_NONE);
/* Join parallel scan */
@@ -2763,8 +2754,8 @@ _brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
scan = table_beginscan_parallel(heap,
ParallelTableScanFromBrinShared(brinshared));
- reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
- brinbuildCallbackParallel, state, scan);
+ reltuples = table_index_build_scan(heap, index, indexInfo,
+ true, true, brinbuildCallbackParallel, state, scan);
/* insert the last item */
form_and_spill_tuple(state);
@@ -2846,8 +2837,6 @@ _brin_parallel_build_main(dsm_segment *seg, shm_toc *toc)
/* Initialize worker's own spool */
buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = heapRel;
- buildstate->bs_spool->index = indexRel;
/* Look up shared state private to tuplesort.c */
sharedsort = shm_toc_lookup(toc, PARALLEL_KEY_TUPLESORT, false);
@@ -2865,7 +2854,8 @@ _brin_parallel_build_main(dsm_segment *seg, shm_toc *toc)
_brin_parallel_scan_and_build(buildstate, buildstate->bs_spool,
brinshared, sharedsort,
- heapRel, indexRel, sortmem, false);
+ heapRel, indexRel,
+ sortmem, false);
/* Report WAL/buffer usage during parallel execution */
bufferusage = shm_toc_lookup(toc, PARALLEL_KEY_BUFFER_USAGE, false);
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 90fc605f1ca..27425880a5f 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -137,16 +137,6 @@ typedef struct
uint32 max_buckets;
} TuplesortIndexHashArg;
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_brin subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- /* XXX do we need something here? */
-} TuplesortIndexBrinArg;
-
/*
* Data struture pointed by "TuplesortPublic.arg" for the Datum case.
* Set by tuplesort_begin_datum and used only by the DatumTuple routines.
@@ -562,20 +552,13 @@ tuplesort_begin_index_gist(Relation heapRel,
}
Tuplesortstate *
-tuplesort_begin_index_brin(Relation heapRel,
- Relation indexRel,
- int workMem,
+tuplesort_begin_index_brin(int workMem,
SortCoordinate coordinate,
int sortopt)
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexBrinArg *arg;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBrinArg *) palloc(sizeof(TuplesortIndexBrinArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -592,12 +575,7 @@ tuplesort_begin_index_brin(Relation heapRel,
base->writetup = writetup_index_brin;
base->readtup = readtup_index_brin;
base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- MemoryContextSwitchTo(oldcontext);
+ base->arg = NULL;
return state;
}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 357eb35311d..103ced4005f 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -430,9 +430,7 @@ extern Tuplesortstate *tuplesort_begin_index_gist(Relation heapRel,
Relation indexRel,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern Tuplesortstate *tuplesort_begin_index_brin(Relation heapRel,
- Relation indexRel,
- int workMem, SortCoordinate coordinate,
+extern Tuplesortstate *tuplesort_begin_index_brin(int workMem, SortCoordinate coordinate,
int sortopt);
extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
Oid sortOperator, Oid sortCollation,
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:
On 12/27/23 12:37, Ranier Vilela wrote:
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571
<http://b437571714707bc6466abde1a0af5e69aaade09c
<http://b437571714707bc6466abde1a0af5e69aaade09c>> Ithink has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_workerThe behavior is the bs_spool (heap and index fields)
are left empty.The code affected is: buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool)); - buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if the
leader
participates as a worker, the tuplesort_begin_index_brin shall be
called
with heap/index being NULL, leading to some failure during the sort.
But
maybe this means we don't actually need the heap/index fields, it's
just
a copy of TuplesortIndexArg, but BRIN does not need that because we
sort
the tuples by blkno, and we don't need the descriptors for that.
Unfortunately I can't test on Windows, since I can't build with meson on
Windows.In any case, the _brin_parallel_scan_and_build does not actually need
the separate heap/index arguments, those are already in the spool.Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or remove
the arguments from _brin_parallel_scan_and_build.Here is a patch simplifying the BRIN parallel create code a little bit.
As I suspected, we don't need the heap/index in the spool at all, and we
don't need to pass it to tuplesort_begin_index_brin either - we only
need blkno, and we have that in the datum1 field. This also means we
don't need TuplesortIndexBrinArg.
With Windows 10, msvc 2022, compile end pass ninja test.
But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the arguments in the call, wouldn't it be better to
decrease them
and this way all arguments will be passed in the registers instead of on a
stack?
bs_spool may well contain this data and will probably be useful in the
future.
I made a v1 version, based on your patch, for your consideration.
best regards,
Ranier Vilela
Attachments:
v1-brin-parallel-create-simplify.patchapplication/octet-stream; name=v1-brin-parallel-create-simplify.patchDownload
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 23f081389b..6650e543a0 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -58,6 +58,7 @@ typedef struct BrinSpool
Tuplesortstate *sortstate; /* state data for tuplesort.c */
Relation heap;
Relation index;
+ uint32 sortmem;
} BrinSpool;
/*
@@ -236,11 +237,9 @@ static Size _brin_parallel_estimate_shared(Relation heap, Snapshot snapshot);
static void _brin_leader_participate_as_worker(BrinBuildState *buildstate,
Relation heap, Relation index);
static void _brin_parallel_scan_and_build(BrinBuildState *buildstate,
- BrinSpool *brinspool,
BrinShared *brinshared,
Sharedsort *sharedsort,
- Relation heap, Relation index,
- int sortmem, bool progress);
+ bool progress);
/*
* BRIN handler function: return IndexAmRoutine with access method parameters
@@ -1200,8 +1199,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo)
* factor.
*/
state->bs_spool->sortstate =
- tuplesort_begin_index_brin(heap, index,
- maintenance_work_mem, coordinate,
+ tuplesort_begin_index_brin(maintenance_work_mem, coordinate,
TUPLESORT_NONE);
/*
@@ -2702,23 +2700,22 @@ static void
_brin_leader_participate_as_worker(BrinBuildState *buildstate, Relation heap, Relation index)
{
BrinLeader *brinleader = buildstate->bs_leader;
- int sortmem;
/* Allocate memory and initialize private spool */
buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = buildstate->bs_spool->heap;
- buildstate->bs_spool->index = buildstate->bs_spool->index;
+ buildstate->bs_spool->heap = heap;
+ buildstate->bs_spool->index = index;
/*
* Might as well use reliable figure when doling out maintenance_work_mem
* (when requested number of workers were not launched, this will be
* somewhat higher than it is for other workers).
*/
- sortmem = maintenance_work_mem / brinleader->nparticipanttuplesorts;
+ buildstate->bs_spool->sortmem = maintenance_work_mem / brinleader->nparticipanttuplesorts;
/* Perform work common to all participants */
- _brin_parallel_scan_and_build(buildstate, buildstate->bs_spool, brinleader->brinshared,
- brinleader->sharedsort, heap, index, sortmem, true);
+ _brin_parallel_scan_and_build(buildstate, brinleader->brinshared,
+ brinleader->sharedsort, true);
}
/*
@@ -2734,15 +2731,15 @@ _brin_leader_participate_as_worker(BrinBuildState *buildstate, Relation heap, Re
* When this returns, workers are done, and need only release resources.
*/
static void
-_brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
+_brin_parallel_scan_and_build(BrinBuildState *state,
BrinShared *brinshared, Sharedsort *sharedsort,
- Relation heap, Relation index, int sortmem,
bool progress)
{
SortCoordinate coordinate;
TableScanDesc scan;
double reltuples;
IndexInfo *indexInfo;
+ BrinSpool *bs_spool;
/* Initialize local tuplesort coordination state */
coordinate = palloc0(sizeof(SortCoordinateData));
@@ -2751,26 +2748,25 @@ _brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
coordinate->sharedsort = sharedsort;
/* Begin "partial" tuplesort */
- brinspool->sortstate = tuplesort_begin_index_brin(brinspool->heap,
- brinspool->index,
- sortmem, coordinate,
+ bs_spool = state->bs_spool;
+ bs_spool->sortstate = tuplesort_begin_index_brin(bs_spool->sortmem, coordinate,
TUPLESORT_NONE);
/* Join parallel scan */
- indexInfo = BuildIndexInfo(index);
+ indexInfo = BuildIndexInfo(bs_spool->index);
indexInfo->ii_Concurrent = brinshared->isconcurrent;
- scan = table_beginscan_parallel(heap,
+ scan = table_beginscan_parallel(bs_spool->heap,
ParallelTableScanFromBrinShared(brinshared));
- reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
- brinbuildCallbackParallel, state, scan);
+ reltuples = table_index_build_scan(bs_spool->heap, bs_spool->index, indexInfo,
+ true, true, brinbuildCallbackParallel, state, scan);
/* insert the last item */
form_and_spill_tuple(state);
/* sort the BRIN ranges built by this worker */
- tuplesort_performsort(brinspool->sortstate);
+ tuplesort_performsort(bs_spool->sortstate);
state->bs_reltuples += reltuples;
@@ -2786,7 +2782,7 @@ _brin_parallel_scan_and_build(BrinBuildState *state, BrinSpool *brinspool,
/* Notify leader */
ConditionVariableSignal(&brinshared->workersdonecv);
- tuplesort_end(brinspool->sortstate);
+ tuplesort_end(bs_spool->sortstate);
}
/*
@@ -2805,7 +2801,6 @@ _brin_parallel_build_main(dsm_segment *seg, shm_toc *toc)
LOCKMODE indexLockmode;
WalUsage *walusage;
BufferUsage *bufferusage;
- int sortmem;
/*
* The only possible status flag that can be set to the parallel worker is
@@ -2849,6 +2844,13 @@ _brin_parallel_build_main(dsm_segment *seg, shm_toc *toc)
buildstate->bs_spool->heap = heapRel;
buildstate->bs_spool->index = indexRel;
+ /*
+ * Might as well use reliable figure when doling out maintenance_work_mem
+ * (when requested number of workers were not launched, this will be
+ * somewhat higher than it is for other workers).
+ */
+ buildstate->bs_spool->sortmem = maintenance_work_mem / brinshared->scantuplesortstates;
+
/* Look up shared state private to tuplesort.c */
sharedsort = shm_toc_lookup(toc, PARALLEL_KEY_TUPLESORT, false);
tuplesort_attach_shared(sharedsort, seg);
@@ -2856,16 +2858,9 @@ _brin_parallel_build_main(dsm_segment *seg, shm_toc *toc)
/* Prepare to track buffer usage during parallel execution */
InstrStartParallelQuery();
- /*
- * Might as well use reliable figure when doling out maintenance_work_mem
- * (when requested number of workers were not launched, this will be
- * somewhat higher than it is for other workers).
- */
- sortmem = maintenance_work_mem / brinshared->scantuplesortstates;
-
- _brin_parallel_scan_and_build(buildstate, buildstate->bs_spool,
+ _brin_parallel_scan_and_build(buildstate,
brinshared, sharedsort,
- heapRel, indexRel, sortmem, false);
+ false);
/* Report WAL/buffer usage during parallel execution */
bufferusage = shm_toc_lookup(toc, PARALLEL_KEY_BUFFER_USAGE, false);
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 90fc605f1c..27425880a5 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -137,16 +137,6 @@ typedef struct
uint32 max_buckets;
} TuplesortIndexHashArg;
-/*
- * Data struture pointed by "TuplesortPublic.arg" for the index_brin subcase.
- */
-typedef struct
-{
- TuplesortIndexArg index;
-
- /* XXX do we need something here? */
-} TuplesortIndexBrinArg;
-
/*
* Data struture pointed by "TuplesortPublic.arg" for the Datum case.
* Set by tuplesort_begin_datum and used only by the DatumTuple routines.
@@ -562,20 +552,13 @@ tuplesort_begin_index_gist(Relation heapRel,
}
Tuplesortstate *
-tuplesort_begin_index_brin(Relation heapRel,
- Relation indexRel,
- int workMem,
+tuplesort_begin_index_brin(int workMem,
SortCoordinate coordinate,
int sortopt)
{
Tuplesortstate *state = tuplesort_begin_common(workMem, coordinate,
sortopt);
TuplesortPublic *base = TuplesortstateGetPublic(state);
- MemoryContext oldcontext;
- TuplesortIndexBrinArg *arg;
-
- oldcontext = MemoryContextSwitchTo(base->maincontext);
- arg = (TuplesortIndexBrinArg *) palloc(sizeof(TuplesortIndexBrinArg));
#ifdef TRACE_SORT
if (trace_sort)
@@ -592,12 +575,7 @@ tuplesort_begin_index_brin(Relation heapRel,
base->writetup = writetup_index_brin;
base->readtup = readtup_index_brin;
base->haveDatum1 = true;
- base->arg = arg;
-
- arg->index.heapRel = heapRel;
- arg->index.indexRel = indexRel;
-
- MemoryContextSwitchTo(oldcontext);
+ base->arg = NULL;
return state;
}
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index 357eb35311..103ced4005 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -430,9 +430,7 @@ extern Tuplesortstate *tuplesort_begin_index_gist(Relation heapRel,
Relation indexRel,
int workMem, SortCoordinate coordinate,
int sortopt);
-extern Tuplesortstate *tuplesort_begin_index_brin(Relation heapRel,
- Relation indexRel,
- int workMem, SortCoordinate coordinate,
+extern Tuplesortstate *tuplesort_begin_index_brin(int workMem, SortCoordinate coordinate,
int sortopt);
extern Tuplesortstate *tuplesort_begin_datum(Oid datumType,
Oid sortOperator, Oid sortCollation,
On 12/29/23 12:53, Ranier Vilela wrote:
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/27/23 12:37, Ranier Vilela wrote:
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
<tomas.vondra@enterprisedb.com<mailto:tomas.vondra@enterprisedb.com>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>>escreveu:
On 12/26/23 19:10, Ranier Vilela wrote:
> Hi,
>
> The commit b437571
<http://b437571714707bc6466abde1a0af5e69aaade09c<http://b437571714707bc6466abde1a0af5e69aaade09c>
<http://b437571714707bc6466abde1a0af5e69aaade09c>>> I
> think has an oversight.
> When allocate memory and initialize private spool in function:
> _brin_leader_participate_as_worker
>
> The behavior is the bs_spool (heap and index fields)
> are left empty.
>
> The code affected is:
> buildstate->bs_spool = (BrinSpool *)palloc0(sizeof(BrinSpool));
> - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
> - buildstate->bs_spool->index = buildstate->bs_spool->index;
> + buildstate->bs_spool->heap = heap;
> + buildstate->bs_spool->index = index;
>
> Is the fix correct?
>Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if
the leader
participates as a worker, the tuplesort_begin_index_brin shall
be called
with heap/index being NULL, leading to some failure during the
sort. But
maybe this means we don't actually need the heap/index fields,
it's just
a copy of TuplesortIndexArg, but BRIN does not need that
because we sort
the tuples by blkno, and we don't need the descriptors for that.
Unfortunately I can't test on Windows, since I can't build with
meson on
Windows.
In any case, the _brin_parallel_scan_and_build does not
actually need
the separate heap/index arguments, those are already in the spool.
Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or
remove
the arguments from _brin_parallel_scan_and_build.
Here is a patch simplifying the BRIN parallel create code a little bit.
As I suspected, we don't need the heap/index in the spool at all, and we
don't need to pass it to tuplesort_begin_index_brin either - we only
need blkno, and we have that in the datum1 field. This also means we
don't need TuplesortIndexBrinArg.With Windows 10, msvc 2022, compile end pass ninja test.
But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the arguments in the call, wouldn't it be better
to decrease them
and this way all arguments will be passed in the registers instead of on
a stack?
If this was beneficial, we'd be passing everything through structs and
not as explicit arguments. But we don't. If you're arguing it's
beneficial in this case, it'd be good to see it demonstrated.
bs_spool may well contain this data and will probably be useful in the
future.I made a v1 version, based on your patch, for your consideration.
I did actually consider doing it this way yesterday, but I don't like
this approach. I don't believe it's faster (and even if it was, the
difference is going to be negligible), and parameters hidden in some
struct increase the cognitive load. I like explicit arguments.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Em sex., 29 de dez. de 2023 às 10:33, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:
On 12/29/23 12:53, Ranier Vilela wrote:
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/27/23 12:37, Ranier Vilela wrote:
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
<tomas.vondra@enterprisedb.com<mailto:tomas.vondra@enterprisedb.com>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>>escreveu:
On 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571
<http://b437571714707bc6466abde1a0af5e69aaade09c>
<http://b437571714707bc6466abde1a0af5e69aaade09c>>> I
think has an oversight.
When allocate memory and initialize private spool infunction:
_brin_leader_participate_as_worker
The behavior is the bs_spool (heap and index fields)
are left empty.The code affected is:
buildstate->bs_spool = (BrinSpool *)palloc0(sizeof(BrinSpool));
- buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if
the leader
participates as a worker, the tuplesort_begin_index_brin shall
be called
with heap/index being NULL, leading to some failure during the
sort. But
maybe this means we don't actually need the heap/index fields,
it's just
a copy of TuplesortIndexArg, but BRIN does not need that
because we sort
the tuples by blkno, and we don't need the descriptors for
that.
Unfortunately I can't test on Windows, since I can't build with
meson on
Windows.
In any case, the _brin_parallel_scan_and_build does not
actually need
the separate heap/index arguments, those are already in the
spool.
Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or
remove
the arguments from _brin_parallel_scan_and_build.
Here is a patch simplifying the BRIN parallel create code a little
bit.
As I suspected, we don't need the heap/index in the spool at all,
and we
don't need to pass it to tuplesort_begin_index_brin either - we only
need blkno, and we have that in the datum1 field. This also means we
don't need TuplesortIndexBrinArg.With Windows 10, msvc 2022, compile end pass ninja test.
But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the arguments in the call, wouldn't it be better
to decrease them
and this way all arguments will be passed in the registers instead of on
a stack?If this was beneficial, we'd be passing everything through structs and
not as explicit arguments. But we don't. If you're arguing it's
beneficial in this case, it'd be good to see it demonstrated.
Please see the https://www.agner.org/optimize/optimizing_cpp.pdf
Excerpt:
"Use 64-bit mode
Parameter transfer is more efficient in 64-bit mode than in 32-bit mode,
and more efficient in 64-bit Linux than in 64-bit Windows. In 64-bit Linux,
the first six integer parameters and the first eight floating point
parameters are transferred in registers, totaling up to fourteen register
parameters. In 64-bit Windows, the first four parameters are transferred in
registers, regardless of whether they are integers or floating point
numbers."
With function:
_brin_parallel_scan_and_build(buildstate, buildstate->bs_spool,
brinshared, sharedsort, heapRel, indexRel, sortmem, false);
We have:
Linux -> six first parameters in registers and two parameters in stack
Windows -> four parameters in registers and four parameters in stack
bs_spool may well contain this data and will probably be useful in the
future.I made a v1 version, based on your patch, for your consideration.
I did actually consider doing it this way yesterday, but I don't like
this approach. I don't believe it's faster (and even if it was, the
difference is going to be negligible), and parameters hidden in some
struct increase the cognitive load. I like explicit arguments.
Personally I prefer data in structs, of course,
always thinking about size and alignment, to optimize loading.
Best regards,
Ranier Vilela
On 12/29/23 14:53, Ranier Vilela wrote:
Em sex., 29 de dez. de 2023 às 10:33, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/29/23 12:53, Ranier Vilela wrote:
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra
<tomas.vondra@enterprisedb.com<mailto:tomas.vondra@enterprisedb.com>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>>escreveu:
On 12/27/23 12:37, Ranier Vilela wrote:
> Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
> <tomas.vondra@enterprisedb.com<mailto:tomas.vondra@enterprisedb.com>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>
<mailto:tomas.vondra@enterprisedb.com
<mailto:tomas.vondra@enterprisedb.com>>>>
> escreveu:
>
>
>
> On 12/26/23 19:10, Ranier Vilela wrote:
> > Hi,
> >
> > The commit b437571
> <http://b437571714707bc6466abde1a0af5e69aaade09c<http://b437571714707bc6466abde1a0af5e69aaade09c>
<http://b437571714707bc6466abde1a0af5e69aaade09c>>
<http://b437571714707bc6466abde1a0af5e69aaade09c>
<http://b437571714707bc6466abde1a0af5e69aaade09c>>>> I
> > think has an oversight.
> > When allocate memory and initialize private spool infunction:
> > _brin_leader_participate_as_worker
> >
> > The behavior is the bs_spool (heap and index fields)
> > are left empty.
> >
> > The code affected is:
> > buildstate->bs_spool = (BrinSpool *)
palloc0(sizeof(BrinSpool));
> > - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
> > - buildstate->bs_spool->index =buildstate->bs_spool->index;
> > + buildstate->bs_spool->heap = heap;
> > + buildstate->bs_spool->index = index;
> >
> > Is the fix correct?
> >
>
> Thanks for noticing this.
>
> You're welcome.
>
>
> Yes, I believe this is a bug - the assignments
> are certainly wrong, it leaves the fields set to NULL.
>
> I wonder how come this didn't fail during testing.Surely, if
the leader
> participates as a worker, the tuplesort_begin_index_brinshall
be called
> with heap/index being NULL, leading to some failureduring the
sort. But
> maybe this means we don't actually need the heap/indexfields,
it's just
> a copy of TuplesortIndexArg, but BRIN does not need that
because we sort
> the tuples by blkno, and we don't need the descriptorsfor that.
>
> Unfortunately I can't test on Windows, since I can't build with
meson on
> Windows.
>
>
> In any case, the _brin_parallel_scan_and_build does not
actually need
> the separate heap/index arguments, those are already inthe spool.
>
> Yeah, for sure.
>
>
> I'll try to figure out if we want to simplify thetuplesort or
remove
> the arguments from _brin_parallel_scan_and_build.
>Here is a patch simplifying the BRIN parallel create code a
little bit.
As I suspected, we don't need the heap/index in the spool at
all, and we
don't need to pass it to tuplesort_begin_index_brin either -
we only
need blkno, and we have that in the datum1 field. This also
means we
don't need TuplesortIndexBrinArg.
With Windows 10, msvc 2022, compile end pass ninja test.
But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the arguments in the call, wouldn't it be better
to decrease them
and this way all arguments will be passed in the registers insteadof on
a stack?
If this was beneficial, we'd be passing everything through structs and
not as explicit arguments. But we don't. If you're arguing it's
beneficial in this case, it'd be good to see it demonstrated.Please see the https://www.agner.org/optimize/optimizing_cpp.pdf
<https://www.agner.org/optimize/optimizing_cpp.pdf>
Excerpt:
"Use 64-bit mode
Parameter transfer is more efficient in 64-bit mode than in 32-bit mode,
and more efficient in 64-bit Linux than in 64-bit Windows. In 64-bit
Linux, the first six integer parameters and the first eight floating
point parameters are transferred in registers, totaling up to fourteen
register parameters. In 64-bit Windows, the first four parameters are
transferred in registers, regardless of whether they are integers or
floating point numbers."With function:
_brin_parallel_scan_and_build(buildstate, buildstate->bs_spool,
brinshared, sharedsort, heapRel, indexRel, sortmem, false);
We have:
Linux -> six first parameters in registers and two parameters in stack
Windows -> four parameters in registers and four parameters in stack
I suggested you demonstrate this actually makes a difference in
practice. Quoting a document is not that.
Also, that document is about C++, and while C and C++ are very close, I
wouldn't be surprised if there were differences. Furthermore, that
section talks about integer/floating point arguments, while we're
dealing with pointers, and it's not clear if that changes something (the
document has a separate section about pointers/references, which
suggests pointers and integers are not 100% the same thing).
And finally, I haven't tried disassembling the code, but I'd be quite
surprised if these things were not heavily dependent on the compiler
and/or optimization level.
bs_spool may well contain this data and will probably be useful in the
future.I made a v1 version, based on your patch, for your consideration.
I did actually consider doing it this way yesterday, but I don't like
this approach. I don't believe it's faster (and even if it was, the
difference is going to be negligible), and parameters hidden in some
struct increase the cognitive load. I like explicit arguments.Personally I prefer data in structs, of course,
always thinking about size and alignment, to optimize loading.
As I said, I think this is quite irrelevant because we'll call the
function maybe 10-times during the whole index build. With millions of
other function calls.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Em sex., 29 de dez. de 2023 às 08:53, Ranier Vilela <ranier.vf@gmail.com>
escreveu:
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:On 12/27/23 12:37, Ranier Vilela wrote:
Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
escreveu:On 12/26/23 19:10, Ranier Vilela wrote:
Hi,
The commit b437571
<http://b437571714707bc6466abde1a0af5e69aaade09c
<http://b437571714707bc6466abde1a0af5e69aaade09c>> Ithink has an oversight.
When allocate memory and initialize private spool in function:
_brin_leader_participate_as_workerThe behavior is the bs_spool (heap and index fields)
are left empty.The code affected is: buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool)); - buildstate->bs_spool->heap = buildstate->bs_spool->heap; - buildstate->bs_spool->index = buildstate->bs_spool->index; + buildstate->bs_spool->heap = heap; + buildstate->bs_spool->index = index;Is the fix correct?
Thanks for noticing this.
You're welcome.
Yes, I believe this is a bug - the assignments
are certainly wrong, it leaves the fields set to NULL.I wonder how come this didn't fail during testing. Surely, if the
leader
participates as a worker, the tuplesort_begin_index_brin shall be
called
with heap/index being NULL, leading to some failure during the
sort. But
maybe this means we don't actually need the heap/index fields, it's
just
a copy of TuplesortIndexArg, but BRIN does not need that because we
sort
the tuples by blkno, and we don't need the descriptors for that.
Unfortunately I can't test on Windows, since I can't build with meson on
Windows.In any case, the _brin_parallel_scan_and_build does not actually
need
the separate heap/index arguments, those are already in the spool.
Yeah, for sure.
I'll try to figure out if we want to simplify the tuplesort or
remove
the arguments from _brin_parallel_scan_and_build.
Here is a patch simplifying the BRIN parallel create code a little bit.
As I suspected, we don't need the heap/index in the spool at all, and we
don't need to pass it to tuplesort_begin_index_brin either - we only
need blkno, and we have that in the datum1 field. This also means we
don't need TuplesortIndexBrinArg.With Windows 10, msvc 2022, compile end pass ninja test.
But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the arguments in the call, wouldn't it be better to
decrease them
and this way all arguments will be passed in the registers instead of on a
stack?bs_spool may well contain this data and will probably be useful in the
future.I made a v1 version, based on your patch, for your consideration.
As I wrote, the new patch version was for consideration.
It seems more like a question of style, so it's better to remove it.
Anyway +1 for your original patch.
Best regards,
Ranier Vilela
On 12/29/23 18:02, Ranier Vilela wrote:
...
As I wrote, the new patch version was for consideration.
It seems more like a question of style, so it's better to remove it.Anyway +1 for your original patch.
I've pushed my original patch. Thanks for the report.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Em sáb., 30 de dez. de 2023 19:19, Tomas Vondra <
tomas.vondra@enterprisedb.com> escreveu:
On 12/29/23 18:02, Ranier Vilela wrote:
...
As I wrote, the new patch version was for consideration.
It seems more like a question of style, so it's better to remove it.Anyway +1 for your original patch.
I've pushed my original patch. Thanks for the report.
Thank you.
Best regards,
Ranier Vilela