OOM on EXPLAIN with lots of nodes

Started by Alexey Bashtanovover 11 years ago12 messageshackers

bashtanov@imap.cc

over 11 years ago

Hello!

I found that EXPLAIN command takes very much memory to execute when huge
unions are used.
For example the following sql
-- begin sql
create table t (a000 int, a001 int, ... a099 int);
explain select * from (
select a001 a from t
union all
select a001 a from t
union all
... (1000 times) ...
union all
select a001 a from t
) _ where a = 1;
-- end sql
took more than 1GB of memory to execute.

Namely converting of the plan to a human-readable form causes excessive
memory usage, not planning itself.

By varying the parameters and reading source code I determined that
memory usage linearly depends on (plan nodes count)*(overall columns
count), thus it quadratically depends on number of tables unionized.

To remove this excessive memory usage I propose
to run deparse_context_for_planstate+deparse_expression in a separate
memory context and free it after a plan node is generated.

Any reasons to treat this idea as bad?

BTW in this case explain execution is also quite long (I got tens of
seconds). But I have no immediate ideas how to improve it.

Regards,
Alexey Bashtanov

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 11 years ago

In reply to: Alexey Bashtanov (#1)

Re: OOM on EXPLAIN with lots of nodes

On 01/13/2015 02:08 PM, Alexey Bashtanov wrote:

I found that EXPLAIN command takes very much memory to execute when huge
unions are used.
For example the following sql
-- begin sql
create table t (a000 int, a001 int, ... a099 int);
explain select * from (
select a001 a from t
union all
select a001 a from t
union all
... (1000 times) ...
union all
select a001 a from t
) _ where a = 1;
-- end sql
took more than 1GB of memory to execute.

Namely converting of the plan to a human-readable form causes excessive
memory usage, not planning itself.

By varying the parameters and reading source code I determined that
memory usage linearly depends on (plan nodes count)*(overall columns
count), thus it quadratically depends on number of tables unionized.

To remove this excessive memory usage I propose
to run deparse_context_for_planstate+deparse_expression in a separate
memory context and free it after a plan node is generated.

Hmm, something like the attached? Seems reasonable...

- Heikki

Alexey Bashtanov

bashtanov@imap.cc

over 11 years ago

In reply to: Heikki Linnakangas (#2)

Re: OOM on EXPLAIN with lots of nodes

On 13.01.2015 16:47, Heikki Linnakangas wrote:

Hmm, something like the attached? Seems reasonable...

- Heikki

yes, i have tested something like this, it stopped eating memory

Just one small notice to the patch you attached: maybe it would be more
safe to switch to oldcxt before calling
ExplainPropertyText/ExplainPropertyList?
Otherwise are you pretty sure ExplainPropertyText and
ExplainPropertyList are not going perform any pallocs without immediate
pfrees in current memory context even in future?

Regards,
Alexey Bashtanov

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Heikki Linnakangas (#2)

Re: OOM on EXPLAIN with lots of nodes

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 01/13/2015 02:08 PM, Alexey Bashtanov wrote:

By varying the parameters and reading source code I determined that
memory usage linearly depends on (plan nodes count)*(overall columns
count), thus it quadratically depends on number of tables unionized.

To remove this excessive memory usage I propose
to run deparse_context_for_planstate+deparse_expression in a separate
memory context and free it after a plan node is generated.

Hmm, something like the attached? Seems reasonable...

This looks pretty unsafe to me: it assumes, without much justification,
that there is no memory allocated during show_expression() that will be
needed later.

I suspect the O(N^2) consumption comes directly from some workspace
allocated for variable deparsing in ruleutils.c. A safer fix would
be to do retail pfrees on that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Tom Lane (#4)

Re: OOM on EXPLAIN with lots of nodes

I wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

Hmm, something like the attached? Seems reasonable...

This looks pretty unsafe to me: it assumes, without much justification,
that there is no memory allocated during show_expression() that will be
needed later.

I suspect the O(N^2) consumption comes directly from some workspace
allocated for variable deparsing in ruleutils.c. A safer fix would
be to do retail pfrees on that.

I looked at this a bit more, and realized that we've got worse problems
than just O(N^2) space consumption: there's also O(N^2) time consumption
in ruleutils' set_simple_column_names(). The constant factor is not too
bad in this example but would be horrible with 1000 regular relations in
the rtable :-(. Safe or not, the extra-context solution doesn't improve
the code's time consumption.

Really the issue here is that explain.c assumes that
deparse_context_for_planstate() is negligibly cheap, which was once true
but is so no longer. What's more, most of the work is dependent only on
the rtable and so there is no good reason at all to be doing it more than
once per plan tree.

I think we can fix this by refactoring so that we construct a deparse
context once in ExplainPrintPlan(), and then just repoint it at
different planstate nodes as we work through the plan tree.

A difficulty with either your patch or my idea is that they require adding
another field to ExplainState, which is an ABI break for any third-party
code that might be declaring variables of that struct type. That's fine
for HEAD but would be risky to back-patch. Any thoughts about whether we
can get away with that (ie, anybody have an idea if there are third-party
extensions that call explain.c)?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Alvaro Herrera

alvherre@2ndquadrant.com

over 11 years ago

In reply to: Tom Lane (#5)

Re: OOM on EXPLAIN with lots of nodes

Tom Lane wrote:

A difficulty with either your patch or my idea is that they require adding
another field to ExplainState, which is an ABI break for any third-party
code that might be declaring variables of that struct type. That's fine
for HEAD but would be risky to back-patch. Any thoughts about whether we
can get away with that (ie, anybody have an idea if there are third-party
extensions that call explain.c)?

codesearch.debian.net shows a couple of hits for ExplainState in
multicorn (an extension for FDW from Python data sources); I didn't look
but it seems that the FDW API is using that struct.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Alvaro Herrera (#6)

Re: OOM on EXPLAIN with lots of nodes

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

A difficulty with either your patch or my idea is that they require adding
another field to ExplainState, which is an ABI break for any third-party
code that might be declaring variables of that struct type. That's fine
for HEAD but would be risky to back-patch. Any thoughts about whether we
can get away with that (ie, anybody have an idea if there are third-party
extensions that call explain.c)?

codesearch.debian.net shows a couple of hits for ExplainState in
multicorn (an extension for FDW from Python data sources); I didn't look
but it seems that the FDW API is using that struct.

It is, but FDWs are not at risk here: they merely reference ExplainStates
that were allocated by core backend code. So as long as we add the new
field at the end it's not a problem for them. Problematic usage would be
like what auto_explain does:

ExplainState es;

ExplainInitState(&es);
...

In hindsight, that's a bad API and we should change it to something like

ExplainState *es = NewExplainState();

so that the sizeof the struct isn't embedded in extension code. But we
definitely can't do that in back branches.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 11 years ago

In reply to: Tom Lane (#7)

Re: OOM on EXPLAIN with lots of nodes

On 01/13/2015 07:24 PM, Tom Lane wrote:

It is, but FDWs are not at risk here: they merely reference ExplainStates
that were allocated by core backend code. So as long as we add the new
field at the end it's not a problem for them. Problematic usage would be
like what auto_explain does:

ExplainState es;

ExplainInitState(&es);
...

In hindsight, that's a bad API and we should change it to something like

ExplainState *es = NewExplainState();

so that the sizeof the struct isn't embedded in extension code. But we
definitely can't do that in back branches.

Actually, it would make sense to do exactly that, to break any
extensions that are doing the unsafe thing in an obvious way. The
downside would be that an extension using the new API would then not
work on an old server.

We could repurpose one of the existing fields in ExplainState to point
to another struct that contains more fields. Something like this:

*** a/src/include/commands/explain.h
--- b/src/include/commands/explain.h
***************
*** 40,48 ****
   	List	   *rtable;			/* range table */
   	List	   *rtable_names;	/* alias names for RTEs */
   	int			indent;			/* current indentation level */
! 	List	   *grouping_stack; /* format-specific grouping state */
   } ExplainState;

   /* Hook for plugins to get control in ExplainOneQuery() */
   typedef void (*ExplainOneQuery_hook_type) (Query *query,
   													   IntoClause *into,
--- 40,55 ----
   	List	   *rtable;			/* range table */
   	List	   *rtable_names;	/* alias names for RTEs */
   	int			indent;			/* current indentation level */
! 	ExplainStateExtraFields *extra;	/* more fields */
   } ExplainState;

+ typedef struct ExplainStateExtraFields
+ {
+ 	List	   *grouping_stack; /* format-specific grouping state */
+ 	...
+ }
+

That's pretty ugly, but it would work even if there are ExplainState
structs embeded in extensions. As long as they don't try to look at the
grouping_stack field; I think that's fairly safe assumption.

But do we really need to backpatch any of this?

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Heikki Linnakangas (#8)

Re: OOM on EXPLAIN with lots of nodes

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 01/13/2015 07:24 PM, Tom Lane wrote:

In hindsight, that's a bad API and we should change it to something like
ExplainState *es = NewExplainState();
so that the sizeof the struct isn't embedded in extension code. But we
definitely can't do that in back branches.

Actually, it would make sense to do exactly that, to break any
extensions that are doing the unsafe thing in an obvious way. The
downside would be that an extension using the new API would then not
work on an old server.

I guess that's a possibility ...

We could repurpose one of the existing fields in ExplainState to point
to another struct that contains more fields. Something like this:
...
That's pretty ugly, but it would work even if there are ExplainState
structs embeded in extensions. As long as they don't try to look at the
grouping_stack field; I think that's fairly safe assumption.

Yeah, I was thinking the same thing, but it's *mighty* ugly and would also
create a back-patch hazard, since presumably we'd not do that in HEAD.

But do we really need to backpatch any of this?

Alexey's example consumes only a couple hundred MB in 9.2, vs about 7GB
peak in 9.3 and up. That seems like a pretty nasty regression.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Tom Lane (#9)

Re: OOM on EXPLAIN with lots of nodes

I wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

But do we really need to backpatch any of this?

Alexey's example consumes only a couple hundred MB in 9.2, vs about 7GB
peak in 9.3 and up. That seems like a pretty nasty regression.

I did a bit more measurement of the time and backend memory consumption
for Alexey's example EXPLAIN:

9.2: 0.9 sec, circa 200 MB
HEAD: 56 sec, circa 7300 MB
with patch below: 3.7 sec, circa 300 MB

So while this doesn't get us all the way back down to where we were before
we started trying to guarantee unique table/column identifiers in EXPLAIN
printouts, it's at least a lot closer.

Not sure whether to just commit this to HEAD and call it a day, or to
risk back-patching.

It occurred to me that we could use your idea of a secondary data
structure and minimize the code impact with a macro layer, ie
#define grouping_stack pointer_field->groupingstack
But I'm not sure if it's worth the trouble. 9.3 has been like this
right along, and this is the first complaint.

One compromise idea would be to back-patch only as far as 9.4.
If there are extensions out there that have this ABI issue, expecting
them to rebuild for 9.4.1 might be OK.

regards, tom lane

#11

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: Tom Lane (#10)

Re: OOM on EXPLAIN with lots of nodes

On Tue, Jan 13, 2015 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

But do we really need to backpatch any of this?

Alexey's example consumes only a couple hundred MB in 9.2, vs about 7GB
peak in 9.3 and up. That seems like a pretty nasty regression.

I did a bit more measurement of the time and backend memory consumption
for Alexey's example EXPLAIN:

9.2: 0.9 sec, circa 200 MB
HEAD: 56 sec, circa 7300 MB
with patch below: 3.7 sec, circa 300 MB

So while this doesn't get us all the way back down to where we were before
we started trying to guarantee unique table/column identifiers in EXPLAIN
printouts, it's at least a lot closer.

Not sure whether to just commit this to HEAD and call it a day, or to
risk back-patching.

I think we need to back-patch something; that's a pretty nasty
regression, and I have some EDB-internal reports that might be from
the same cause. I'm not too concerned about forcibly breaking the API
here, but I can understand why somebody might want to do that. If we
do, I like the idea of renaming ExplainInitState() or maybe by
replacing it by a NewExplainState() function that is used instead.
But I'm not sure how necessary it is really.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tom Lane

tgl@sss.pgh.pa.us

over 11 years ago

In reply to: Robert Haas (#11)

Re: OOM on EXPLAIN with lots of nodes

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jan 13, 2015 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Not sure whether to just commit this to HEAD and call it a day, or to
risk back-patching.

I think we need to back-patch something; that's a pretty nasty
regression, and I have some EDB-internal reports that might be from
the same cause.

OK. I went ahead and made the necessary code changes to avoid changing
the struct size in released branches, as per Heikki's idea.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

OOM on EXPLAIN with lots of nodes

Attachments:

Attachments: