Don't deform column-by-column in composite_to_json

Started by Andres Freundabout 7 years ago5 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

In /messages/by-id/20190201162404.onngi77f26baem4g@alap3.anarazel.de
I noticed that composite_to_json() deforms column-by-column. Given that
it always processes all columns, that seems quite the waste of resources.

In some quick'n dirty dirty testing this gives a ~4% benefit in a table
without nulls and varlenas, and ~7% in one with both. I assume that if
one were to test with a bit wider table the win would be bigger.

A short test shows that it'd be slower to allocate nulls/values with
palloc rather than using MaxHeapAttributeNumber. Given that only output
functions are called from within composite_to_json(), I think that's ok.

Greetings,

Andres Freund

Attachments:

faster_composite_to_json.difftext/x-diff; charset=us-asciiDownload+7-6
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Andres Freund (#1)
Re: Don't deform column-by-column in composite_to_json

On 2 Feb 2019, at 00:21, Andres Freund <andres@anarazel.de> wrote:

In /messages/by-id/20190201162404.onngi77f26baem4g@alap3.anarazel.de
I noticed that composite_to_json() deforms column-by-column. Given that
it always processes all columns, that seems quite the waste of resources.

In some quick'n dirty dirty testing this gives a ~4% benefit in a table
without nulls and varlenas, and ~7% in one with both. I assume that if
one were to test with a bit wider table the win would be bigger.

A short test shows that it'd be slower to allocate nulls/values with
palloc rather than using MaxHeapAttributeNumber. Given that only output
functions are called from within composite_to_json(), I think that's ok.

Nice catch, patch looks good to me. composite_to_jsonb() has the same
construction, processing every attribute. Should it get a similar patch as
this?

cheers ./daniel

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: Don't deform column-by-column in composite_to_json

On 2019-Feb-01, Andres Freund wrote:

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index de0d0723b71..8724022df54 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1755,6 +1755,8 @@ composite_to_json(Datum composite, StringInfo result, bool use_line_feeds)
int			i;
bool		needsep = false;
const char *sep;
+	Datum		values[MaxHeapAttributeNumber];
+	bool		nulls[MaxHeapAttributeNumber];

sep = use_line_feeds ? ",\n " : ",";

Isn't this putting much more than needed in the stack? Seems like we
could just allocate tupdesc->natts members dynamically. Not sure if we
care: it's about 12 kB; maybe considering palloc overhead, using the
stack is better.

Worth asking. But if this is worth doing here, then it's worth doing in
a lot more places, isn't it?

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#3)
Re: Don't deform column-by-column in composite_to_json

On 2019-02-05 22:53:37 -0300, Alvaro Herrera wrote:

On 2019-Feb-01, Andres Freund wrote:

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index de0d0723b71..8724022df54 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1755,6 +1755,8 @@ composite_to_json(Datum composite, StringInfo result, bool use_line_feeds)
int			i;
bool		needsep = false;
const char *sep;
+	Datum		values[MaxHeapAttributeNumber];
+	bool		nulls[MaxHeapAttributeNumber];

sep = use_line_feeds ? ",\n " : ",";

Isn't this putting much more than needed in the stack? Seems like we
could just allocate tupdesc->natts members dynamically. Not sure if we
care: it's about 12 kB; maybe considering palloc overhead, using the
stack is better.

I addressed that:

A short test shows that it'd be slower to allocate nulls/values with
palloc rather than using MaxHeapAttributeNumber. Given that only output
functions are called from within composite_to_json(), I think that's ok.

Worth asking. But if this is worth doing here, then it's worth doing in
a lot more places, isn't it?

"it" being allocating values/nulls on the stack? I think there's plenty
of places that do that. But it's also worth considering whether the
relevant piece of code calls more deeply into other code, in which case
the stack usage might be more problematic.

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: Don't deform column-by-column in composite_to_json

Andres Freund <andres@anarazel.de> writes:

On 2019-02-05 22:53:37 -0300, Alvaro Herrera wrote:

Isn't this putting much more than needed in the stack? Seems like we
could just allocate tupdesc->natts members dynamically. Not sure if we
care: it's about 12 kB; maybe considering palloc overhead, using the
stack is better.

"it" being allocating values/nulls on the stack? I think there's plenty
of places that do that. But it's also worth considering whether the
relevant piece of code calls more deeply into other code, in which case
the stack usage might be more problematic.

I think it's OK as long as there's a stack depth check here or somewhere
real close by.

regards, tom lane