Patch for ginCombineData

Started by Robert Abrahamover 10 years ago4 messages
#1Robert Abraham
robert.abraham86@googlemail.com
1 attachment(s)

Hello,

we are using gin indexes on big tables. these tables happen to have several
billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c
because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in
src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is
used.
The test suite still succeeds on my machine.
Find the patch attached,

Kind regards,

Robert Abraham

Attachments:

gin_combiner_repalloc_huge.patchtext/x-patch; charset=US-ASCII; name=gin_combiner_repalloc_huge.patchDownload
*** a/src/backend/access/gin/ginbulk.c
--- b/src/backend/access/gin/ginbulk.c
***************
*** 39,45 **** ginCombineData(RBNode *existing, const RBNode *newdata, void *arg)
  		accum->allocatedMemory -= GetMemoryChunkSpace(eo->list);
  		eo->maxcount *= 2;
  		eo->list = (ItemPointerData *)
! 			repalloc(eo->list, sizeof(ItemPointerData) * eo->maxcount);
  		accum->allocatedMemory += GetMemoryChunkSpace(eo->list);
  	}
  
--- 39,45 ----
  		accum->allocatedMemory -= GetMemoryChunkSpace(eo->list);
  		eo->maxcount *= 2;
  		eo->list = (ItemPointerData *)
! 			repalloc_huge(eo->list, sizeof(ItemPointerData) * eo->maxcount);
  		accum->allocatedMemory += GetMemoryChunkSpace(eo->list);
  	}
  
#2Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Robert Abraham (#1)
Re: Patch for ginCombineData

Hi!

On Wed, Aug 5, 2015 at 1:17 PM, Robert Abraham <
robert.abraham86@googlemail.com> wrote:

we are using gin indexes on big tables. these tables happen to have
several billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c
because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in
src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is
used.
The test suite still succeeds on my machine.
Find the patch attached,

Thank you for notice and for the patch!
You should have maintenance_work_mem > 1gb and some very frequent entry so
that it's posting list exceeds 1 gb itself.
These circumstances shouldn't be very rare in modern systems. I think it
could be backpatched.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#3Robert Haas
robertmhaas@gmail.com
In reply to: Robert Abraham (#1)
Re: Patch for ginCombineData

On Wed, Aug 5, 2015 at 6:17 AM, Robert Abraham
<robert.abraham86@googlemail.com> wrote:

we are using gin indexes on big tables. these tables happen to have several
billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c
because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in
src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is
used.
The test suite still succeeds on my machine.
Find the patch attached,

Since nobody seems ready to act on this patch immediately, I suggest
adding it here so it doesn't get forgotten:

https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Jeff Janes
jeff.janes@gmail.com
In reply to: Robert Abraham (#1)
Re: Patch for ginCombineData

On Wed, Aug 5, 2015 at 3:17 AM, Robert Abraham <
robert.abraham86@googlemail.com> wrote:

Hello,

we are using gin indexes on big tables. these tables happen to have
several billion rows.
the index creation fails in ginCombineData in src/backend/access/ginbulk.c
because repalloc is limited to 1 gb.
this limitation makes no sense in this context (compare comments in
src/include/utils/memutils.h).
To overcome this limitation on tables with lots of rows repalloc_huge is
used.
The test suite still succeeds on my machine.
Find the patch attached,

This looks good to me.

One possible issue I see is that if accum.allocatedMemory is only slightly
less than maintenance_work_mem just before we decide to repalloc, then
doing the repalloc_huge could cause us to exceed maintenance_work_mem.
Potentially by a lot. In the worst case (when all the data we've seen
during this round of accumulation falls into the same key), we would
overrun by a factor of almost 2. Or almost 3, if you count the time during
the repalloc when the new data has been allocated but the old data not yet
freed. Perhaps the code here should look at the amount of
maintenance_work_mem left and grow by less than a factor of 2 if it is too
close to going over.

Mitigating that, it won't actually use very much of that memory. Once
control passes back at the end of this tuple, the caller will then realize
it has overshot the maintenance_work_mem and will flush it all. I think
most modern OSes will not have a problems caused by allocated but untouched
memory.

This patch doesn't introduce that problem, it just allows it to operate at
a higher absolute size. So I'm marking this as ready for committer.

Cheers,

Jeff