IndexJoin memory problem using spgist and boxes

Started by Anton Dignösover 8 years ago10 messageshackers

dignoes@inf.unibz.it

over 8 years ago

Hi,

I came across a strange memory problem when doing an IndexJoin using
spgist on boxes.
I also found it mentioned here:
/messages/by-id/CAPqRbE5vTGWCGrOc91Bmu-0o7CwsU0UCnAshOtpDR8cSpSjy0g@mail.gmail.com

With the following setting:

CREATE TABLE r AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000);
CREATE TABLE s AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000) ORDER BY random(); -- random sort just
speeds up index creation
CREATE INDEX s_idx ON s USING spgist(box);

postgres consumes several GBs of main memory for the following query:

SELECT * FROM r, s WHERE r.box && s.box;

The problem also occurs for polygons which are based on boxes and are
now part of the dev version.

The attached patch should fix this problem by maintaining the
traversal memory per index scan instead of for the entire join.

Best regards,
Anton

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 8 years ago

In reply to: Anton Dignös (#1)

Re: IndexJoin memory problem using spgist and boxes

Hi Anton,

I can reproduce the high memory consumption with your queries.

Looking at the patch, I see that you changed the lifetime of the
temporary context from per-tuple to per-index-scan. It is not obvious
that this change is correct. Could you explain, what memory context are
involved in the scan, and what their lifetimes are, before and after
your changes? What are these memory allocations that are causing the
high consumption, and what code makes them? This will make it easier to
understand your changes.

/messages/by-id/CAPqRbE5vTGWCGrOc91Bmu-0o7CwsU0UCnAshOtpDR8cSpSjy0g@mail.gmail.com

Also, this link doesn't open for me.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Anton Dignös

dignoes@inf.unibz.it

over 8 years ago

In reply to: Alexander Kuzmenkov (#2)

Re: IndexJoin memory problem using spgist and boxes

Hi Alexander,

thanks for the feedback.

I can reproduce the high memory consumption with your queries.

Looking at the patch, I see that you changed the lifetime of the temporary
context from per-tuple to per-index-scan. It is not obvious that this change
is correct. Could you explain, what memory context are involved in the scan,
and what their lifetimes are, before and after your changes? What are these
memory allocations that are causing the high consumption, and what code
makes them? This will make it easier to understand your changes.

Yes, you are right, I changed the temporary context for calling the
user-defined inner consistent method from per-tuple to per-index-scan.
For boxes this function is spg_box_quad_inner_consistent in
geo_spgist.c that allocates nodes, RangeBoxes, and RectBoxes.

The problem before this patch was that the traversalMemoryContext in
this function was set to per-query lifetime.
The memory allocations in the per-query lifetime caused this high
memory consumption.

I changed the temporary context to per-index-scan so that it can also
be used for traversalMemoryContext.
Calling the function in per-index-scan lifetime did cause a high
memory consumption.

The better alternative may be to have two temporary memory contexts,
one per-tuple for calling the inner consistent method and one
per-index-scan for the traversal memory.

What do you think?

/messages/by-id/CAPqRbE5vTGWCGrOc91Bmu-0o7CwsU0UCnAshOtpDR8cSpSjy0g@mail.gmail.com

Also, this link doesn't open for me.

The link works perfectly fine for me.

Best regards,
Anton

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: Anton Dignös (#3)

Re: IndexJoin memory problem using spgist and boxes

=?UTF-8?Q?Anton_Dign=C3=B6s?= <dignoes@inf.unibz.it> writes:

Looking at the patch, I see that you changed the lifetime of the temporary
context from per-tuple to per-index-scan. It is not obvious that this change
is correct.

The problem before this patch was that the traversalMemoryContext in
this function was set to per-query lifetime.
The memory allocations in the per-query lifetime caused this high
memory consumption.

Yeah ...

I changed the temporary context to per-index-scan so that it can also
be used for traversalMemoryContext.

But we have also had many complaints about leakage across a single index
scan, if it traverses many index entries. I think you're just moving the
pain out of one use-case and into another.

regards, tom lane

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 8 years ago

In reply to: Anton Dignös (#3)

Re: IndexJoin memory problem using spgist and boxes

On 04.03.2018 20:20, Anton Dignös wrote:

The better alternative may be to have two temporary memory contexts,
one per-tuple for calling the inner consistent method and one
per-index-scan for the traversal memory.

Yes, this seems to be a better way of fixing the problem without
introducing regressions mentioned by Tom. We'd keep a separate traversal
context in ScanOpaque and run most of the spgWalk in it, except calling
storeRes in query context and the inner consistent method in short-lived
context.

Also, I think it would be worthwhile to test the resulting patch with
valgrind. The allocations are not very apparent in the code, so it's
easy to miss something.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Anton Dignös

dignoes@inf.unibz.it

over 8 years ago

In reply to: Alexander Kuzmenkov (#5)

Re: IndexJoin memory problem using spgist and boxes

The better alternative may be to have two temporary memory contexts,
one per-tuple for calling the inner consistent method and one
per-index-scan for the traversal memory.

Yes, this seems to be a better way of fixing the problem without introducing
regressions mentioned by Tom. We'd keep a separate traversal context in
ScanOpaque and run most of the spgWalk in it, except calling storeRes in
query context and the inner consistent method in short-lived context.

Thanks to both for the feedback.
I will work on that and come back to you.

Also, I think it would be worthwhile to test the resulting patch with
valgrind. The allocations are not very apparent in the code, so it's easy to
miss something.

I tried with valgrind in the first place and didn't see any suspicious
memory leaks but I will give it another try.

Best regards,
Anton

Anton Dignös

dignoes@inf.unibz.it

over 8 years ago

In reply to: Anton Dignös (#6)

Re: IndexJoin memory problem using spgist and boxes

Hi,

attached is the patch that uses two memory contexts.
One for calling the inner consistent function,
and a new one for keeping the traversal memory of the inner consistent function.

I run some test to compare the memory footprints. I report the total maximum
memory usage (sum of all children) of the per-query memory context
that is the parent of all memory
contexts used. The test datasets are described below.

TEST | HEAD | previous patch | patch V2
------+-------+----------------+----------
T1 | 3.4GB | 98kB | 81kB
T2 | 3.4GB | 17MB | 17MB
T3 | 3.5GB | 17MB | 17MB
T4 | 7GB | 17MB | 17MB
T5 | 8GB | 34MB | 25MB

T1: 1M x 1M tuples with relatively few overlaps
T2: as T1, but with 1 tuple in the outer relation for which the entire
index reports matches
T3: as T1, but with 10 tuples in the outer relation for which the
entire index reports matches
T4: as T3, but the outer relation has double the number of tuples
T5: as T4, but the inner relation has double the number of tuples

TEST dataset creation queries (executed incrementally)

T1:
-- create table r with 1M tuples
CREATE TABLE r AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000);
-- create table s with 1M tuples
CREATE TABLE s AS SELECT 1 i, box(point(generate_series,
generate_series), point(generate_series+10, generate_series+10)) FROM
generate_series(1, 1000000) ORDER BY random(); -- random sort just
speeds up index creation
CREATE INDEX s_idx ON s USING spgist(box);

T2:
-- inserts a tuple for which the entire index is a match
INSERT INTO r VALUES (2, box(point(1, 1), point(1000000, 1000000)));

T3:
-- inserts 9 more tuples as in T2.

T4:
-- doubles the outer relations
INSERT INTO r SELECT * FROM r;

T5:
-- doubles the inner relation
INSERT INTO s SELECT * FROM s;

The new patch is a bit better in terms of memory by using the two
memory contexts.
I also checked the query times using explain analyze, both patches
have approximately the same runtime,
but run 5-6 times faster than HEAD.

Best regards,
Anton

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 8 years ago

In reply to: Anton Dignös (#7)

Re: IndexJoin memory problem using spgist and boxes

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

The updated version looks good to me. make installcheck under valgrind finds no errors. I also tried the T1 and T5 datasets, here are the timings and memory usage:
T1 T5
Vanilla 2G/24.2 s over 7G/didn't wait
Patch v2 144M/23.0 s 148M/108 s

The new status of this patch is: Ready for Committer

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: Alexander Kuzmenkov (#8)

Re: IndexJoin memory problem using spgist and boxes

Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:

The updated version looks good to me.

LGTM too. Pushed with some trivial cosmetic adjustments.

regards, tom lane

#10

Anton Dignös

anton.dignoes@gmail.com

over 8 years ago

In reply to: Tom Lane (#9)

Re: IndexJoin memory problem using spgist and boxes

On Tue, Mar 20, 2018 at 5:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:

The updated version looks good to me.

LGTM too. Pushed with some trivial cosmetic adjustments.

regards, tom lane

Thank you both!

Best regards,
Anton

IndexJoin memory problem using spgist and boxes

Attachments:

Attachments: