Parallel Bitmap scans a bit broken
I was just doing some testing with [1] when I noticed that there's a problem
with parallel bitmap index scans.
Test case:
patch with [1]
=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from
generate_Series(1,1000000);
INSERT 0 1000000
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
QUERY PLAN
-----------------------------------------------------------------------------------------
Gather (cost=3623.52..11267.45 rows=5000 width=4)
Workers Planned: 2
-> Parallel Bitmap Heap Scan on r1 (cost=2623.52..9767.45 rows=2083
width=4)
Recheck Cond: (value = 555)
-> Bitmap Index Scan on r1_value_idx (cost=0.00..2622.27
rows=522036 width=0)
Index Cond: (value = 555)
(6 rows)
=# explain analyze select * from r1 where value=555;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
The crash occurs in tbm_shared_iterate() at:
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
I see that in tbm_prepare_shared_iterate() tbm->npages is zero. I'm unsure
whether bringetbitmap() does something different from btgetbitmap() around
setting npages.
In any case, because npages is 0, tbm->ptpages is not allocated in
tbm_prepare_shared_iterate():
if (tbm->npages)
{
tbm->ptpages = dsa_allocate(tbm->dsa, sizeof(PTIterationArray) +
tbm->npages * sizeof(int));
so when tbm_shared_iterate() runs this code:
/*
* If both chunk and per-page data remain, must output the numerically
* earlier page.
*/
if (istate->schunkptr < istate->nchunks)
{
PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
BlockNumber chunk_blockno;
chunk_blockno = chunk->blockno + istate->schunkbit;
if (istate->spageptr >= istate->npages ||
chunk_blockno < page->blockno)
{
/* Return a lossy page indicator from the chunk */
output->blockno = chunk_blockno;
output->ntuples = -1;
output->recheck = true;
istate->schunkbit++;
LWLockRelease(&istate->lock);
return output;
}
}
it fails, because idxpages points at random memory.
Probably this is a simple fix for the authors, so I'm passing it along. I'm
a bit unable to see how the part above is meant to work.
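The hazard described above can be shown as a standalone sketch (hypothetical names, not the PostgreSQL code): an array that is only conditionally allocated must be re-guarded by the same condition at every later access.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Standalone sketch with hypothetical names (not the actual tidbitmap.c
 * structures): the page-index array is only allocated when npages > 0, so
 * every consumer must check the bound before indexing into it.
 */
typedef struct
{
    int    *idxpages;           /* stays NULL when npages == 0 */
    int     npages;
} SharedState;

static void
prepare(SharedState *st, int npages)
{
    st->npages = npages;
    /* mirror the conditional allocation: nothing is allocated for 0 pages */
    st->idxpages = (npages > 0) ? calloc(npages, sizeof(int)) : NULL;
}

/* Guarded read: return -1 instead of dereferencing an unallocated array. */
static int
read_page_index(const SharedState *st, int pos)
{
    if (pos >= st->npages)      /* also covers the npages == 0 case */
        return -1;
    return st->idxpages[pos];
}
```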
[1]: /messages/by-id/attachment/50164/brin-correlation-v3.patch
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 9, 2017 at 9:17 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
patch with [1]
=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from
generate_Series(1,1000000);
INSERT 0 1000000
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
I am looking into the issue, I have already reproduced it. I will
update on this soon.
Thanks for reporting.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Mar 9, 2017 at 9:37 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from
generate_Series(1,1000000);
INSERT 0 1000000
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
I am looking into the issue, I have already reproduced it. I will
update on this soon.
Thanks for reporting.
I slightly modified your query to reproduce this issue.
explain analyze select * from r1 where value<555;
Patch is attached to fix the problem.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
parallel_bitmap_fix.patch (application/octet-stream)
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index 44cc9da..8e91564 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -1103,13 +1103,12 @@ tbm_shared_iterate(TBMSharedIterator *iterator)
if (istate->schunkptr < istate->nchunks)
{
PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
- PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
BlockNumber chunk_blockno;
chunk_blockno = chunk->blockno + istate->schunkbit;
if (istate->spageptr >= istate->npages ||
- chunk_blockno < page->blockno)
+ chunk_blockno < ptbase[idxpages[istate->spageptr]].blockno)
{
/* Return a lossy page indicator from the chunk */
output->blockno = chunk_blockno;
On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
I slightly modified your query to reproduce this issue.
explain analyze select * from r1 where value<555;
Patch is attached to fix the problem.
I forgot to mention the cause of the problem.
if (istate->schunkptr < istate->nchunks)
{
PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
BlockNumber chunk_blockno;
In the above if condition we have only checked istate->schunkptr <
istate->nchunks, which means some chunk is left, so we are safe to
access idxchunks. But just after that we access
ptbase[idxpages[istate->spageptr]] without checking whether accessing
idxpages is safe.
tbm_iterate already handles this case; I broke it in tbm_shared_iterate.
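The safe ordering can be sketched as a simplified standalone C example (hypothetical names, not the actual tidbitmap.c code): check that the page cursor is in range before dereferencing through it, and only then compare block numbers.

```c
#include <assert.h>

/*
 * Simplified sketch (illustrative names, not PostgreSQL source) of merging
 * two sorted streams of entries, returning the numerically earlier block.
 */
typedef struct
{
    int     blockno;
} Entry;

static int
next_block(const Entry *chunks, int nchunks, int *chunkptr,
           const Entry *pages, int npages, int *pageptr)
{
    if (*chunkptr < nchunks)
    {
        int     chunk_blockno = chunks[*chunkptr].blockno;

        /*
         * Only touch pages[*pageptr] after verifying *pageptr is in range;
         * dereferencing before the bound check is exactly the bug above.
         */
        if (*pageptr >= npages || chunk_blockno < pages[*pageptr].blockno)
        {
            (*chunkptr)++;
            return chunk_blockno;
        }
    }
    if (*pageptr < npages)
        return pages[(*pageptr)++].blockno;
    return -1;                  /* both streams exhausted */
}
```

With npages == 0 the pages array may be NULL, and the short-circuit `*pageptr >= npages` keeps it from ever being dereferenced.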
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 9, 2017 at 11:50 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
I slightly modified your query to reproduce this issue.
explain analyze select * from r1 where value<555;
Patch is attached to fix the problem.
I forgot to mention the cause of the problem.
if (istate->schunkptr < istate->nchunks)
{
PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
BlockNumber chunk_blockno;
In above if condition we have only checked istate->schunkptr <
istate->nchunks that means we have some chunk left so we are safe to
access idxchunks, But just after that we are accessing
ptbase[idxpages[istate->spageptr]] without checking that accessing
idxpages is safe or not.
tbm_iterator already handling this case, I broke it in tbm_shared_iterator.
I don't know if this is the only problem -- it would be good if David
could retest -- but it's certainly *a* problem, so committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 10 March 2017 at 06:17, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 9, 2017 at 11:50 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Mar 9, 2017 at 10:02 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
I slightly modified your query to reproduce this issue.
explain analyze select * from r1 where value<555;
Patch is attached to fix the problem.
I forgot to mention the cause of the problem.
if (istate->schunkptr < istate->nchunks)
{
PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
BlockNumber chunk_blockno;
In above if condition we have only checked istate->schunkptr <
istate->nchunks that means we have some chunk left so we are safe to
access idxchunks, But just after that we are accessing
ptbase[idxpages[istate->spageptr]] without checking that accessing
idxpages is safe or not.
tbm_iterator already handling this case, I broke it in tbm_shared_iterator.
I don't know if this is the only problem -- it would be good if David
could retest -- but it's certainly *a* problem, so committed.
Thanks for committing, and generally parallelising more stuff.
I confirm that my test case is now working again.
I'll be in this general area today, so will mention if I stumble over
anything that looks broken.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
I don't know if this is the only problem
I'll be in this general area today, so will mention if I stumble over
anything that looks broken.
I was testing the same patch with a large dataset and got a different segfault:
hasegeli=# explain select * from only mp_notification_20170225 where server_id = 7;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Gather (cost=26682.94..476995.88 rows=1 width=215)
Workers Planned: 2
-> Parallel Bitmap Heap Scan on mp_notification_20170225 (cost=25682.94..475995.78 rows=1 width=215)
Recheck Cond: (server_id = 7)
-> Bitmap Index Scan on mp_notification_block_idx (cost=0.00..25682.94 rows=4557665 width=0)
Index Cond: (server_id = 7)
(6 rows)
hasegeli=# select * from only mp_notification_20170225 where server_id = 7;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
* thread #1: tid = 0x5045a8f, 0x000000010ae44558 postgres`brin_deform_tuple(brdesc=0x00007fea3c86a3a8, tuple=0x00007fea3c891040) + 40 at brin_tuple.c:414, queue = 'com.apple.main-thread', stop reason = signal SIGUSR1
* frame #0: 0x000000010ae44558 postgres`brin_deform_tuple(brdesc=0x00007fea3c86a3a8, tuple=0x00007fea3c891040) + 40 at brin_tuple.c:414 [opt]
frame #1: 0x000000010ae4000c postgres`bringetbitmap(scan=0x00007fea3c875c20, tbm=<unavailable>) + 428 at brin.c:398 [opt]
frame #2: 0x000000010ae9b451 postgres`index_getbitmap(scan=0x00007fea3c875c20, bitmap=<unavailable>) + 65 at indexam.c:726 [opt]
frame #3: 0x000000010b0035a9 postgres`MultiExecBitmapIndexScan(node=<unavailable>) + 233 at nodeBitmapIndexscan.c:91 [opt]
frame #4: 0x000000010b002840 postgres`BitmapHeapNext(node=<unavailable>) + 400 at nodeBitmapHeapscan.c:143 [opt]
frame #5: 0x000000010afef5d0 postgres`ExecProcNode(node=0x00007fea3c873948) + 224 at execProcnode.c:459 [opt]
frame #6: 0x000000010b004cc9 postgres`ExecGather [inlined] gather_getnext(gatherstate=<unavailable>) + 520 at nodeGather.c:276 [opt]
frame #7: 0x000000010b004ac1 postgres`ExecGather(node=<unavailable>) + 497 at nodeGather.c:212 [opt]
frame #8: 0x000000010afef6b2 postgres`ExecProcNode(node=0x00007fea3c872f58) + 450 at execProcnode.c:541 [opt]
frame #9: 0x000000010afeaf90 postgres`standard_ExecutorRun [inlined] ExecutePlan(estate=<unavailable>, planstate=<unavailable>, use_parallel_mode=<unavailable>, operation=<unavailable>, numberTuples=0, direction=<unavailable>, dest=<unavailable>) + 29 at execMain.c:1616 [opt]
frame #10: 0x000000010afeaf73 postgres`standard_ExecutorRun(queryDesc=<unavailable>, direction=<unavailable>, count=0) + 291 at execMain.c:348 [opt]
frame #11: 0x000000010af8b108 postgres`ExplainOnePlan(plannedstmt=0x00007fea3c871040, into=0x0000000000000000, es=0x00007fea3c805360, queryString=<unavailable>, params=<unavailable>, planduration=<unavailable>) + 328 at explain.c:533 [opt]
frame #12: 0x000000010af8ab98 postgres`ExplainOneQuery(query=0x00007fea3c805890, cursorOptions=<unavailable>, into=0x0000000000000000, es=0x00007fea3c805360, queryString=<unavailable>,params=0x0000000000000000) + 280 at explain.c:369 [opt]
frame #13: 0x000000010af8a773 postgres`ExplainQuery(pstate=<unavailable>, stmt=0x00007fea3d005450, queryString="explain analyze select * from only mp_notification_20170225 where server_id > 6;",params=0x0000000000000000, dest=0x00007fea3c8052c8) + 819 at explain.c:254 [opt]
frame #14: 0x000000010b13b660 postgres`standard_ProcessUtility(pstmt=0x00007fea3d005fa8, queryString="explain analyze select * from only mp_notification_20170225 where server_id > 6;",context=PROCESS_UTILITY_TOPLEVEL, params=0x0000000000000000, dest=0x00007fea3c8052c8, completionTag=<unavailable>) + 1104 at utility.c:675 [opt]
frame #15: 0x000000010b13ad2a postgres`PortalRunUtility(portal=0x00007fea3c837640, pstmt=0x00007fea3d005fa8, isTopLevel='\x01', setHoldSnapshot=<unavailable>, dest=0x00007fea3c8052c8, completionTag=<unavailable>) + 90 at pquery.c:1165 [opt]
frame #16: 0x000000010b139f56 postgres`FillPortalStore(portal=0x00007fea3c837640, isTopLevel='\x01') + 182 at pquery.c:1025 [opt]
frame #17: 0x000000010b139c22 postgres`PortalRun(portal=0x00007fea3c837640, count=<unavailable>, isTopLevel='\x01', dest=<unavailable>, altdest=<unavailable>, completionTag=<unavailable>) + 402 at pquery.c:757 [opt]
frame #18: 0x000000010b13789b postgres`PostgresMain + 44 at postgres.c:1101 [opt]
frame #19: 0x000000010b13786f postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, dbname=<unavailable>, username=<unavailable>) + 8927 at postgres.c:4066 [opt]
frame #20: 0x000000010b0ba113 postgres`PostmasterMain [inlined] BackendRun + 7587 at postmaster.c:4317 [opt]
frame #21: 0x000000010b0ba0e8 postgres`PostmasterMain [inlined] BackendStartup at postmaster.c:3989 [opt]
frame #22: 0x000000010b0ba0e8 postgres`PostmasterMain at postmaster.c:1729 [opt]
frame #23: 0x000000010b0ba0e8 postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 7544 at postmaster.c:1337 [opt]
frame #24: 0x000000010b0332af postgres`main(argc=<unavailable>, argv=<unavailable>) + 1567 at main.c:228 [opt]
frame #25: 0x00007fffb4e28255 libdyld.dylib`start + 1
I can try to provide a test case, if that wouldn't be enough to spot
the problem.
On Wed, Mar 15, 2017 at 8:11 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
I can try to provide a test case, if that wouldn't be enough to spot
the problem.
Thanks for reporting, I am looking into this. Meanwhile, if you can
provide the reproducible test case then locating the issue will be
faster.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 15, 2017 at 8:51 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
I can try to provide a test case, if that wouldn't be enough to spot
the problem.
Thanks for reporting, I am looking into this. Meanwhile, if you can
provide the reproducible test case then locating the issue will be
faster.
After multiple attempts with different datasets I am unable to
reproduce the issue. I tried the test case below:
create table t(a int, b varchar);
insert into t values(generate_series(1,10000000), repeat('x', 100));
insert into t values(generate_series(1,100000000), repeat('x', 100));
create index idx on t using brin(a);
postgres=# analyze t;
ANALYZE
postgres=# explain analyze select * from t where a>6;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=580794.52..3059826.52 rows=110414922 width=105) (actual
time=92.324..91853.716 rows=110425971 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Bitmap Heap Scan on t (cost=579794.52..3058826.52
rows=46006218 width=105) (actual time=65.651..62023.020 rows=36808657
loops=3)
Recheck Cond: (a > 6)
Rows Removed by Index Recheck: 4
Heap Blocks: lossy=204401
-> Bitmap Index Scan on idx (cost=0.00..552190.79
rows=110425920 width=0) (actual time=88.215..88.215 rows=19040000
loops=1)
Index Cond: (a > 6)
Planning time: 1.116 ms
Execution time: 96176.881 ms
(11 rows)
Is it possible for you to provide a reproducible test case? I also
applied the patch given upthread [1] but still could not reproduce.
[1]: /messages/by-id/attachment/50164/brin-correlation-v3.patch
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Mar 15, 2017 at 8:11 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
* thread #1: tid = 0x5045a8f, 0x000000010ae44558 postgres`brin_deform_tuple(brdesc=0x00007fea3c86a3a8, tuple=0x00007fea3c891040) + 40 at brin_tuple.c:414, queue = 'com.apple.main-thread', stop reason = signal SIGUSR1
* frame #0: 0x000000010ae44558 postgres`brin_deform_tuple(brdesc=0x00007fea3c86a3a8, tuple=0x00007fea3c891040) + 40 at brin_tuple.c:414 [opt]
frame #1: 0x000000010ae4000c postgres`bringetbitmap(scan=0x00007fea3c875c20, tbm=<unavailable>) + 428 at brin.c:398 [opt]
frame #2: 0x000000010ae9b451 postgres`index_getbitmap(scan=0x00007fea3c875c20, bitmap=<unavailable>) + 65 at indexam.c:726 [opt]
frame #3: 0x000000010b0035a9 postgres`MultiExecBitmapIndexScan(node=<unavailable>) + 233 at nodeBitmapIndexscan.c:91 [opt]
frame #4: 0x000000010b002840 postgres`BitmapHeapNext(node=<unavailable>) + 400 at nodeBitmapHeapscan.c:143 [opt]
Analyzing the call stack further, it seems this is not the exact call
stack where it crashed, because if you look at the code in
brin_deform_tuple (line 414):
brin_deform_tuple(BrinDesc *brdesc, BrinTuple *tuple)
{
dtup = brin_new_memtuple(brdesc);
if (BrinTupleIsPlaceholder(tuple))
dtup->bt_placeholder = true;
dtup->bt_blkno = tuple->bt_blkno; --> line 414
This can crash at line 414 only if either tuple is invalid memory (but
I think it's not, because we have already accessed this memory in the
if check above) or dtup is invalid (also not possible, because
brin_new_memtuple has already accessed it).
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
This can crash at line:414, if either tuple is invalid memory(but I
think it's not because we have already accessed this memory in above
if check) or dtup is invalid (this is also not possible because
brin_new_memtuple has already accessed this).
I was testing with the brin correlation patch [1] applied. I cannot
crash it without the patch. I am sorry for not testing that
before. The patch makes the BRIN selectivity estimation function access
more information.
[1]: /messages/by-id/attachment/50164/brin-correlation-v3.patch
On Wed, Mar 15, 2017 at 10:02 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
I was testing with the brin correlation patch [1] applied. I cannot
crash it without the patch either. I am sorry for not testing it
before. The patch make BRIN selectivity estimation function access
more information.
[1] /messages/by-id/attachment/50164/brin-correlation-v3.patch
With my test case, I could not crash even with this patch applied.
Can you provide your test case?
(table, index, data, query)
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
With my test case, I could not crash even with this patch applied.
Can you provide your test case?
Yes:
hasegeli=# create table r2 as select (random() * 3)::int as i from generate_series(1, 1000000);
SELECT 1000000
hasegeli=# create index on r2 using brin (i);
CREATE INDEX
hasegeli=# analyze r2;
ANALYZE
hasegeli=# explain select * from only r2 where i = 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Gather (cost=2867.50..9225.32 rows=1 width=4)
Workers Planned: 2
-> Parallel Bitmap Heap Scan on r2 (cost=1867.50..8225.22 rows=1 width=4)
Recheck Cond: (i = 10)
-> Bitmap Index Scan on r2_i_idx (cost=0.00..1867.50 rows=371082 width=0)
Index Cond: (i = 10)
(6 rows)
hasegeli=# select * from only r2 where i = 10;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
On Wed, Mar 15, 2017 at 10:21 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
hasegeli=# create table r2 as select (random() * 3)::int as i from generate_series(1, 1000000);
SELECT 1000000
hasegeli=# create index on r2 using brin (i);
CREATE INDEX
hasegeli=# analyze r2;
ANALYZE
hasegeli=# explain select * from only r2 where i = 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Gather (cost=2867.50..9225.32 rows=1 width=4)
Workers Planned: 2
-> Parallel Bitmap Heap Scan on r2 (cost=1867.50..8225.22 rows=1 width=4)
Recheck Cond: (i = 10)
-> Bitmap Index Scan on r2_i_idx (cost=0.00..1867.50 rows=371082 width=0)
Index Cond: (i = 10)
(6 rows)
hasegeli=# select * from only r2 where i = 10;
I am able to reproduce the bug, and the attached patch fixes it.
The problem is that I am not handling the TBM_EMPTY state properly. I
remember that while reviewing the patch Robert mentioned that we might
need to handle TBM_EMPTY, and I said that since we are not handling it
in non-parallel mode we don't need to handle it here either. But I
was wrong. So the problem is that if the state is not TBM_HASH the code
directly assumes TBM_ONE_PAGE, which is completely wrong. I have
fixed that, and also fixed other similar locations.
Please verify the fix.
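The three-state point above can be sketched as a minimal standalone example (illustrative only, not the PostgreSQL source): with an explicit TBM_EMPTY arm, "not TBM_HASH" is never silently treated as TBM_ONE_PAGE.

```c
#include <assert.h>

/*
 * Illustrative sketch (not the actual tidbitmap.c code): a bitmap can be in
 * one of three states, so a two-way if/else over the status loses one case.
 */
typedef enum { TBM_EMPTY, TBM_ONE_PAGE, TBM_HASH } TBMStatus;

/*
 * Number of page entries an iterator must prepare for. The explicit
 * TBM_EMPTY arm avoids touching page arrays that were never allocated.
 */
static int
pages_to_prepare(TBMStatus status, int npages)
{
    switch (status)
    {
        case TBM_HASH:
            return npages;      /* full pagetable exists */
        case TBM_ONE_PAGE:
            return 1;           /* single embedded entry */
        case TBM_EMPTY:
        default:
            return 0;           /* nothing allocated; nothing to iterate */
    }
}
```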
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
fix_tbm_empty.patch (application/octet-stream)
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index 8e91564..970635e 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -363,13 +363,16 @@ void
tbm_free_shared_area(dsa_area *dsa, dsa_pointer dp)
{
TBMSharedIteratorState *istate = dsa_get_address(dsa, dp);
- PTEntryArray *ptbase = dsa_get_address(dsa, istate->pagetable);
+ PTEntryArray *ptbase;
PTIterationArray *ptpages;
PTIterationArray *ptchunks;
- if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
- dsa_free(dsa, istate->pagetable);
-
+ if (istate->pagetable)
+ {
+ ptbase = dsa_get_address(dsa, istate->pagetable);
+ if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
+ dsa_free(dsa, istate->pagetable);
+ }
if (istate->spages)
{
ptpages = dsa_get_address(dsa, istate->spages);
@@ -856,7 +859,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
Assert(npages == tbm->npages);
Assert(nchunks == tbm->nchunks);
}
- else
+ else if (tbm->status == TBM_ONE_PAGE)
{
/*
* In one page mode allocate the space for one pagetable entry and
@@ -868,8 +871,8 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
ptpages->index[0] = 0;
}
- pg_atomic_init_u32(&ptbase->refcount, 0);
-
+ if (ptbase)
+ pg_atomic_init_u32(&ptbase->refcount, 0);
if (npages > 1)
qsort_arg((void *) (ptpages->index), npages, sizeof(int),
tbm_shared_comparator, (void *) ptbase->ptentry);
@@ -899,7 +902,8 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
* increase the refcount by 1 so that while freeing the shared iterator
* we don't free pagetable and iterator array until its refcount becomes 0.
*/
- pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
+ if (ptbase)
+ pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
if (ptpages)
pg_atomic_add_fetch_u32(&ptpages->refcount, 1);
if (ptchunks)
Please verify the fix.
The same test with both of the patches applied still crashes for me.
On Thu, Mar 16, 2017 at 12:56 AM, Emre Hasegeli <emre@hasegeli.com> wrote:
Please verify the fix.
The same test with both of the patches applied still crashes for me.
After above fix, I am not able to reproduce. Can you give me the
backtrace of the crash location or the dump?
I am trying on the below commit
commit c5832346625af4193b1242e57e7d13e66a220b38
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Mar 15 11:19:39 2017 -0400
+ https://www.postgresql.org/message-id/attachment/50164/brin-correlation-v3.patch
+ fix_tbm_empty.patch
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 16, 2017 at 5:02 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
After above fix, I am not able to reproduce. Can you give me the
backtrace of the crash location or the dump?
I am trying on the below commit
commit c5832346625af4193b1242e57e7d13e66a220b38
Author: Stephen Frost <sfrost@snowman.net>
Date: Wed Mar 15 11:19:39 2017 -0400
+ https://www.postgresql.org/message-id/attachment/50164/brin-correlation-v3.patch
+ fix_tbm_empty.patch
Forgot to mention: after the fix I am seeing this output.
postgres=# explain analyze select * from only r2 where i = 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Gather (cost=2880.56..9251.98 rows=1 width=4) (actual
time=3.857..3.857 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Bitmap Heap Scan on r2 (cost=1880.56..8251.88 rows=1
width=4) (actual time=0.043..0.043 rows=0 loops=3)
Recheck Cond: (i = 10)
-> Bitmap Index Scan on r2_i_idx (cost=0.00..1880.56
rows=373694 width=0) (actual time=0.052..0.052 rows=0 loops=1)
Index Cond: (i = 10)
Planning time: 0.111 ms
Execution time: 4.449 ms
(9 rows)
postgres=# select * from only r2 where i = 10;
i
---
(0 rows)
Are you getting the crash with the same test case?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Are you getting the crash with the same test case?
Yes. Here is the new backtrace:
* thread #1: tid = 0x51828fd, 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57 [opt]
frame #1: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32_impl(val_=0) at generic.h:163 [opt]
frame #2: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32(val=0) + 17 at atomics.h:237 [opt]
frame #3: 0x0000000100caf303 postgres`tbm_prepare_shared_iterate(tbm=<unavailable>) + 723 at tidbitmap.c:875 [opt]
frame #4: 0x0000000100c74844 postgres`BitmapHeapNext(node=<unavailable>) + 436 at nodeBitmapHeapscan.c:154 [opt]
frame #5: 0x0000000100c615b0 postgres`ExecProcNode(node=0x00007fdabf8189f0) + 224 at execProcnode.c:459 [opt]
frame #6: 0x0000000100c76ca9 postgres`ExecGather [inlined] gather_getnext(gatherstate=<unavailable>) + 520 at nodeGather.c:276 [opt]
frame #7: 0x0000000100c76aa1 postgres`ExecGather(node=<unavailable>) + 497 at nodeGather.c:212 [opt]
frame #8: 0x0000000100c61692 postgres`ExecProcNode(node=0x00007fdabf818558) + 450 at execProcnode.c:541 [opt]
frame #9: 0x0000000100c5cf70 postgres`standard_ExecutorRun [inlined] ExecutePlan(estate=<unavailable>, planstate=<unavailable>, use_parallel_mode=<unavailable>, operation=<unavailable>, numberTuples=0, direction=<unavailable>, dest=<unavailable>) + 29 at execMain.c:1616 [opt]
frame #10: 0x0000000100c5cf53 postgres`standard_ExecutorRun(queryDesc=<unavailable>, direction=<unavailable>, count=0) + 291 at execMain.c:348 [opt]
frame #11: 0x0000000100dac0df postgres`PortalRunSelect(portal=0x00007fdac000b240, forward=<unavailable>, count=0, dest=<unavailable>) + 255 at pquery.c:921 [opt]
frame #12: 0x0000000100dabc84 postgres`PortalRun(portal=0x00007fdac000b240, count=<unavailable>, isTopLevel='\x01', dest=<unavailable>, altdest=<unavailable>, completionTag=<unavailable>) + 500 at pquery.c:762 [opt]
frame #13: 0x0000000100da989b postgres`PostgresMain + 44 at postgres.c:1101 [opt]
frame #14: 0x0000000100da986f postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, dbname=<unavailable>, username=<unavailable>) + 8927 at postgres.c:4066 [opt]
frame #15: 0x0000000100d2c113 postgres`PostmasterMain [inlined] BackendRun + 7587 at postmaster.c:4317 [opt]
frame #16: 0x0000000100d2c0e8 postgres`PostmasterMain [inlined] BackendStartup at postmaster.c:3989 [opt]
frame #17: 0x0000000100d2c0e8 postgres`PostmasterMain at postmaster.c:1729 [opt]
frame #18: 0x0000000100d2c0e8 postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 7544 at postmaster.c:1337 [opt]
frame #19: 0x0000000100ca528f postgres`main(argc=<unavailable>, argv=<unavailable>) + 1567 at main.c:228 [opt]
frame #20: 0x00007fffb4e28255 libdyld.dylib`start + 1
frame #21: 0x00007fffb4e28255 libdyld.dylib`start + 1
On Thu, Mar 16, 2017 at 3:52 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
* thread #1: tid = 0x51828fd, 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57 [opt]
frame #1: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32_impl(val_=0) at generic.h:163 [opt]
frame #2: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32(val=0) + 17 at atomics.h:237 [opt]
By looking at the call stack I found the problem location. I am
reviewing other parts of the code for similar mistakes in other
places. I will post the patch soon. Thanks for the help.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 16, 2017 at 5:14 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
pg_atomic_write_u32_impl(val=0) at generic.h:57, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_write_u32_impl(val=0) at generic.h:57 [opt]
frame #1: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32_impl(val_=0) at generic.h:163 [opt]
frame #2: 0x0000000100caf314 postgres`tbm_prepare_shared_iterate [inlined] pg_atomic_init_u32(val=0) + 17 at atomics.h:237 [opt]
By looking at the call stack I got the problem location. I am
reviewing other parts of the code if there are the similar mistake at
other places. Soon I will post the patch. Thanks for the help.
Based on the call stack I have tried to fix the issue. The problem is
that there was some uninitialized pointer access (in some special
cases, i.e. TBM_EMPTY, when the pagetable is not created at all).
fix_tbm_empty.patch fixed some of those but induced the one you are
seeing in your call stack.
Hopefully, this time I got it correct. Since I am unable to reproduce
the issue, I will again need your help in verifying the fix.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
fix_tbm_empty_v2.patch (application/octet-stream)
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index 8e91564..c7954e3 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -302,6 +302,10 @@ tbm_create(long maxbytes, dsa_area *dsa)
tbm->maxentries = (int) nbuckets;
tbm->lossify_start = 0;
tbm->dsa = dsa;
+ tbm->dsapagetable = InvalidDsaPointer;
+ tbm->dsapagetableold = InvalidDsaPointer;
+ tbm->ptpages = InvalidDsaPointer;
+ tbm->ptchunks = InvalidDsaPointer;
return tbm;
}
@@ -363,13 +367,16 @@ void
tbm_free_shared_area(dsa_area *dsa, dsa_pointer dp)
{
TBMSharedIteratorState *istate = dsa_get_address(dsa, dp);
- PTEntryArray *ptbase = dsa_get_address(dsa, istate->pagetable);
+ PTEntryArray *ptbase;
PTIterationArray *ptpages;
PTIterationArray *ptchunks;
- if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
- dsa_free(dsa, istate->pagetable);
-
+ if (istate->pagetable)
+ {
+ ptbase = dsa_get_address(dsa, istate->pagetable);
+ if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
+ dsa_free(dsa, istate->pagetable);
+ }
if (istate->spages)
{
ptpages = dsa_get_address(dsa, istate->spages);
@@ -786,7 +793,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
{
dsa_pointer dp;
TBMSharedIteratorState *istate;
- PTEntryArray *ptbase;
+ PTEntryArray *ptbase = NULL;
PTIterationArray *ptpages = NULL;
PTIterationArray *ptchunks = NULL;
@@ -797,7 +804,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
* Allocate TBMSharedIteratorState from DSA to hold the shared members and
* lock, this will also be used by multiple worker for shared iterate.
*/
- dp = dsa_allocate(tbm->dsa, sizeof(TBMSharedIteratorState));
+ dp = dsa_allocate0(tbm->dsa, sizeof(TBMSharedIteratorState));
istate = dsa_get_address(tbm->dsa, dp);
/*
@@ -856,7 +863,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
Assert(npages == tbm->npages);
Assert(nchunks == tbm->nchunks);
}
- else
+ else if (tbm->status == TBM_ONE_PAGE)
{
/*
* In one page mode allocate the space for one pagetable entry and
@@ -868,8 +875,8 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
ptpages->index[0] = 0;
}
- pg_atomic_init_u32(&ptbase->refcount, 0);
-
+ if (ptbase)
+ pg_atomic_init_u32(&ptbase->refcount, 0);
if (npages > 1)
qsort_arg((void *) (ptpages->index), npages, sizeof(int),
tbm_shared_comparator, (void *) ptbase->ptentry);
@@ -899,7 +906,8 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
* increase the refcount by 1 so that while freeing the shared iterator
* we don't free pagetable and iterator array until its refcount becomes 0.
*/
- pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
+ if (ptbase)
+ pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
if (ptpages)
pg_atomic_add_fetch_u32(&ptpages->refcount, 1);
if (ptchunks)
@@ -1069,9 +1077,16 @@ tbm_shared_iterate(TBMSharedIterator *iterator)
{
TBMIterateResult *output = &iterator->output;
TBMSharedIteratorState *istate = iterator->state;
- PagetableEntry *ptbase = iterator->ptbase->ptentry;
- int *idxpages = iterator->ptpages->index;
- int *idxchunks = iterator->ptchunks->index;
+ PagetableEntry *ptbase;
+ int *idxpages;
+ int *idxchunks;
+
+ if (iterator->ptbase)
+ ptbase = iterator->ptbase->ptentry;
+ if (iterator->ptpages)
+ idxpages = iterator->ptpages->index;
+ if (iterator->ptchunks)
+ idxchunks = iterator->ptchunks->index;
/* Acquire the LWLock before accessing the shared members */
LWLockAcquire(&istate->lock, LW_EXCLUSIVE);
@@ -1480,7 +1495,7 @@ tbm_attach_shared_iterate(dsa_area *dsa, dsa_pointer dp)
* Create the TBMSharedIterator struct, with enough trailing space to
* serve the needs of the TBMIterateResult sub-struct.
*/
- iterator = (TBMSharedIterator *) palloc(sizeof(TBMSharedIterator) +
+ iterator = (TBMSharedIterator *) palloc0(sizeof(TBMSharedIterator) +
MAX_TUPLES_PER_PAGE * sizeof(OffsetNumber));
istate = (TBMSharedIteratorState *) dsa_get_address(dsa, dp);
Hopefully, this time I got it correct. Since I am unable to reproduce
the issue so I will again need your help in verifying the fix.
It is not crashing with the new patch. Thank you.
On Thu, Mar 16, 2017 at 10:56 AM, Emre Hasegeli <emre@hasegeli.com> wrote:
Hopefully, this time I got it correct. Since I am unable to reproduce
the issue so I will again need your help in verifying the fix.
It is not crashing with the new patch. Thank you.
Thanks for confirming. Some review comments on v2:
+ if (istate->pagetable)
Please compare explicitly to InvalidDsaPointer.
+ if (iterator->ptbase)
+ ptbase = iterator->ptbase->ptentry;
+ if (iterator->ptpages)
+ idxpages = iterator->ptpages->index;
+ if (iterator->ptchunks)
+ idxchunks = iterator->ptchunks->index;
Similarly.
Dilip, please also provide a proposed commit message describing what
this is fixing. Is it just the TBM_EMPTY case, or is there anything
else?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Mar 16, 2017 at 8:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Thanks for confirming. Some review comments on v2:
+ if (istate->pagetable)
fixed
Please compare explicitly to InvalidDsaPointer.
+ if (iterator->ptbase)
+ ptbase = iterator->ptbase->ptentry;
+ if (iterator->ptpages)
+ idxpages = iterator->ptpages->index;
+ if (iterator->ptchunks)
+ idxchunks = iterator->ptchunks->index;
Similarly.
fixed
Also fixed at
+ if (ptbase)
+ pg_atomic_init_u32(&ptbase->refcount, 0);
Dilip, please also provide a proposed commit message describing what
this is fixing. Is it just the TBM_EMPTY case, or is there anything
else?
Okay, I have added the commit message in the patch.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachments:
fix_tbm_empty_v3.patch (application/octet-stream)
commit 22faaa3ebd1ba4b04db79b8a05f91f7d9f36892d
Author: dilip <dilip@localhost.localdomain>
Date: Thu Mar 16 22:56:00 2017 +0530
Fix an uninitialized memory access in the TBM_EMPTY (empty pagetable) case. Also
fix other similar instances that can cause an uninitialized memory access.
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index 8e91564..ae7a913 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -302,6 +302,10 @@ tbm_create(long maxbytes, dsa_area *dsa)
tbm->maxentries = (int) nbuckets;
tbm->lossify_start = 0;
tbm->dsa = dsa;
+ tbm->dsapagetable = InvalidDsaPointer;
+ tbm->dsapagetableold = InvalidDsaPointer;
+ tbm->ptpages = InvalidDsaPointer;
+ tbm->ptchunks = InvalidDsaPointer;
return tbm;
}
@@ -363,20 +367,23 @@ void
tbm_free_shared_area(dsa_area *dsa, dsa_pointer dp)
{
TBMSharedIteratorState *istate = dsa_get_address(dsa, dp);
- PTEntryArray *ptbase = dsa_get_address(dsa, istate->pagetable);
+ PTEntryArray *ptbase;
PTIterationArray *ptpages;
PTIterationArray *ptchunks;
- if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
- dsa_free(dsa, istate->pagetable);
-
- if (istate->spages)
+ if (DsaPointerIsValid(istate->pagetable))
+ {
+ ptbase = dsa_get_address(dsa, istate->pagetable);
+ if (pg_atomic_sub_fetch_u32(&ptbase->refcount, 1) == 0)
+ dsa_free(dsa, istate->pagetable);
+ }
+ if (DsaPointerIsValid(istate->spages))
{
ptpages = dsa_get_address(dsa, istate->spages);
if (pg_atomic_sub_fetch_u32(&ptpages->refcount, 1) == 0)
dsa_free(dsa, istate->spages);
}
- if (istate->schunks)
+ if (DsaPointerIsValid(istate->schunks))
{
ptchunks = dsa_get_address(dsa, istate->schunks);
if (pg_atomic_sub_fetch_u32(&ptchunks->refcount, 1) == 0)
@@ -786,7 +793,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
{
dsa_pointer dp;
TBMSharedIteratorState *istate;
- PTEntryArray *ptbase;
+ PTEntryArray *ptbase = NULL;
PTIterationArray *ptpages = NULL;
PTIterationArray *ptchunks = NULL;
@@ -797,7 +804,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
* Allocate TBMSharedIteratorState from DSA to hold the shared members and
* lock, this will also be used by multiple worker for shared iterate.
*/
- dp = dsa_allocate(tbm->dsa, sizeof(TBMSharedIteratorState));
+ dp = dsa_allocate0(tbm->dsa, sizeof(TBMSharedIteratorState));
istate = dsa_get_address(tbm->dsa, dp);
/*
@@ -856,7 +863,7 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
Assert(npages == tbm->npages);
Assert(nchunks == tbm->nchunks);
}
- else
+ else if (tbm->status == TBM_ONE_PAGE)
{
/*
* In one page mode allocate the space for one pagetable entry and
@@ -868,8 +875,8 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
ptpages->index[0] = 0;
}
- pg_atomic_init_u32(&ptbase->refcount, 0);
-
+ if (ptbase != NULL)
+ pg_atomic_init_u32(&ptbase->refcount, 0);
if (npages > 1)
qsort_arg((void *) (ptpages->index), npages, sizeof(int),
tbm_shared_comparator, (void *) ptbase->ptentry);
@@ -899,10 +906,11 @@ tbm_prepare_shared_iterate(TIDBitmap *tbm)
* increase the refcount by 1 so that while freeing the shared iterator
* we don't free pagetable and iterator array until its refcount becomes 0.
*/
- pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
- if (ptpages)
+ if (ptbase != NULL)
+ pg_atomic_add_fetch_u32(&ptbase->refcount, 1);
+ if (ptpages != NULL)
pg_atomic_add_fetch_u32(&ptpages->refcount, 1);
- if (ptchunks)
+ if (ptchunks != NULL)
pg_atomic_add_fetch_u32(&ptchunks->refcount, 1);
/* Initialize the iterator lock */
@@ -1069,9 +1077,16 @@ tbm_shared_iterate(TBMSharedIterator *iterator)
{
TBMIterateResult *output = &iterator->output;
TBMSharedIteratorState *istate = iterator->state;
- PagetableEntry *ptbase = iterator->ptbase->ptentry;
- int *idxpages = iterator->ptpages->index;
- int *idxchunks = iterator->ptchunks->index;
+ PagetableEntry *ptbase = NULL;
+ int *idxpages = NULL;
+ int *idxchunks = NULL;
+
+ if (iterator->ptbase != NULL)
+ ptbase = iterator->ptbase->ptentry;
+ if (iterator->ptpages != NULL)
+ idxpages = iterator->ptpages->index;
+ if (iterator->ptchunks != NULL)
+ idxchunks = iterator->ptchunks->index;
/* Acquire the LWLock before accessing the shared members */
LWLockAcquire(&istate->lock, LW_EXCLUSIVE);
@@ -1480,7 +1495,7 @@ tbm_attach_shared_iterate(dsa_area *dsa, dsa_pointer dp)
* Create the TBMSharedIterator struct, with enough trailing space to
* serve the needs of the TBMIterateResult sub-struct.
*/
- iterator = (TBMSharedIterator *) palloc(sizeof(TBMSharedIterator) +
+ iterator = (TBMSharedIterator *) palloc0(sizeof(TBMSharedIterator) +
MAX_TUPLES_PER_PAGE * sizeof(OffsetNumber));
istate = (TBMSharedIteratorState *) dsa_get_address(dsa, dp);
On Thu, Mar 16, 2017 at 8:26 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
Hopefully, this time I got it correct. Since I am unable to reproduce
the issue so I will again need your help in verifying the fix.
It is not crashing with the new patch. Thank you.
Thanks for verifying.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Thu, Mar 16, 2017 at 1:50 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
fixed
Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company