patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Started by Pavel Stehuleover 15 years ago54 messageshackers
Jump to latest
#1Pavel Stehule
pavel.stehule@gmail.com

Hello

this patch remove a multiple detoasting of varlena values in plpgsql.

It is usable mainly for iteration over longer array directly loaded
from relation.

It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.

sample: table 10000 rows one column with array with 1000 string fields:

patched pl time: 6 sec
unpatched pl time: 170 sec

This doesn't change my opinion on FOR-IN-ARRAY cycle (is still
important for readability) - just remove one critical performance
issue

Regards

Pavel Stehule

Attachments:

varlena_accelerator.difftext/x-patch; charset=US-ASCII; name=varlena_accelerator.diffDownload+49-0
#2Andrew Dunstan
andrew@dunslane.net
In reply to: Pavel Stehule (#1)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On 11/22/2010 07:46 AM, Pavel Stehule wrote:

Hello

this patch remove a multiple detoasting of varlena values in plpgsql.

It is usable mainly for iteration over longer array directly loaded
from relation.

It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.

sample: table 10000 rows one column with array with 1000 string fields:

patched pl time: 6 sec
unpatched pl time: 170 sec

Since you haven't told us exactly how you tested this it's hard to gauge
the test results.

cheers

andrew

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Dunstan (#2)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2010/11/22 Andrew Dunstan <andrew@dunslane.net>:

On 11/22/2010 07:46 AM, Pavel Stehule wrote:

Hello

this patch remove a multiple detoasting of varlena values in plpgsql.

It is usable mainly for iteration over longer array directly loaded
from relation.

It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.

sample: table 10000 rows one column with array with 1000 string fields:

patched pl time: 6 sec
unpatched pl time: 170 sec

Since you haven't told us exactly how you tested this it's hard to gauge the
test results.

sorry - it is related to tests from FOR-IN-ARRAY thread

create table t1000(x text[]);

CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;

create or replace function rndarray(int) returns text[] as
$$select array(select rndstr() from generate_series(1,$1)) $$ language sql;

insert into t1000 select rndarray(1000) from generate_series(1,10000);

CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text;
BEGIN
FOR i IN array_lower($1,1)..array_upper($1,1)
LOOP
EXIT WHEN l = $3;
IF $1[i] LIKE $2 THEN
s := s || $1[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$

test query: select avg(array_upper(filter02(x,'%AA%', 10),1)) from t1000;

Regards

Pavel Stehule

Show quoted text

cheers

andrew

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Pavel Stehule (#3)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Excerpts from Pavel Stehule's message of lun nov 22 10:01:23 -0300 2010:

sorry - it is related to tests from FOR-IN-ARRAY thread

test query: select avg(array_upper(filter02(x,'%AA%', 10),1)) from t1000;

Yeah, I can measure a 25x improvement in that test with the patch
applied.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Noah Misch
noah@leadboat.com
In reply to: Pavel Stehule (#1)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Hello Pavel,

I'm reviewing this patch for CommitFest 2011-01.

The patch seems fully desirable. It appropriately contains no documentation
updates. It contains no new tests, and that's probably fine, too; I can't think
of any corner cases where this would do something other than work correctly or
break things comprehensively.

Using your test case from here:
http://archives.postgresql.org/message-id/AANLkTikDCW+c-C4U4NgaOBhpFSZkb5Uy_ZuaTDZfPMSn@mail.gmail.com
I observed a 28x speedup, similar to ��lvaro's report. I also made my own test
case, attached, to evaluate this from a somewhat different angle and also to
consider the worst case. With a 100 KiB string (good case), I see a 4.8x
speedup. With a 1 KiB string (worst case - never toasted), I see no
statistically significant change. This is to be expected.

A few specific questions and comments follow, all minor. Go ahead and mark the
patch ready for committer when you've acted on them, or declined to do so, to
your satisfaction. I don't think a re-review will be needed.

Thanks,
nm

On Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote:

*** ./pl_exec.c.orig	2010-11-16 10:28:42.000000000 +0100
--- ./pl_exec.c	2010-11-22 13:33:01.597726809 +0100

The patch applies cleanly, but the format is slightly nonstandard (-p0 when
applied from src/pl/plpgsql/src, rather than -p1 from the root).

***************
*** 3944,3949 ****
--- 3965,3993 ----
*typeid = var->datatype->typoid;
*typetypmod = var->datatype->atttypmod;
+ 
+ 				/*
+ 				 * explicit deTOAST and decomprim for varlena types

"decompress", perhaps?

+ 				 */
+ 				if (var->should_be_detoasted)
+ 				{
+ 					Datum dvalue;
+ 
+ 					Assert(!var->isnull);
+ 
+ 					oldcontext = MemoryContextSwitchTo(estate->fn_mcxt);
+ 					dvalue = PointerGetDatum(PG_DETOAST_DATUM(var->value));
+ 					if (dvalue != var->value)
+ 					{
+ 						if (var->freeval)
+ 							free_var(var);
+ 						var->value = dvalue;
+ 						var->freeval = true;
+ 					}
+ 					MemoryContextSwitchTo(oldcontext); 

This line adds trailing white space.

+ 					var->should_be_detoasted = false;
+ 				}
+ 
*value = var->value;
*isnull = var->isnull;
break;
*** ./plpgsql.h.orig	2010-11-16 10:28:42.000000000 +0100
--- ./plpgsql.h	2010-11-22 13:12:38.897851879 +0100
***************
*** 644,649 ****
--- 645,651 ----
bool		fn_is_trigger;
PLpgSQL_func_hashkey *fn_hashkey;	/* back-link to hashtable key */
MemoryContext fn_cxt;
+ 	MemoryContext	fn_mcxt;		/* link to function's memory context */
Oid			fn_rettype;
int			fn_rettyplen;
***************
*** 692,697 ****
--- 694,701 ----
Oid			rettype;		/* type of current retval */
Oid			fn_rettype;		/* info about declared function rettype */
+ 	MemoryContext	fn_mcxt;		/* link to function's memory context */
+ 
bool		retistuple;
bool		retisset;

I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch. Is the
PLpgSQL_function.fn_mcxt leftover from an earlier design?

I could not readily tell the difference between fn_cxt and fn_mcxt. As I
understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context
used to cache facts across many transactions. Perhaps name the member something
like "top_cxt", "exec_cxt" or "proc_cxt" and comment it like "/* SPI Proc memory
context */" to make this clearer.

Attachments:

plpgsql-detoast.sqltext/plain; charset=us-asciiDownload
#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Noah Misch (#5)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Hello

2011/1/15 Noah Misch <noah@leadboat.com>:

Hello Pavel,

I'm reviewing this patch for CommitFest 2011-01.

Thank you very much,

I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.

The patch seems fully desirable.  It appropriately contains no documentation
updates.  It contains no new tests, and that's probably fine, too; I can't think
of any corner cases where this would do something other than work correctly or
break things comprehensively.

Using your test case from here:
http://archives.postgresql.org/message-id/AANLkTikDCW+c-C4U4NgaOBhpFSZkb5Uy_ZuaTDZfPMSn@mail.gmail.com
I observed a 28x speedup, similar to Álvaro's report.  I also made my own test
case, attached, to evaluate this from a somewhat different angle and also to
consider the worst case.  With a 100 KiB string (good case), I see a 4.8x
speedup.  With a 1 KiB string (worst case - never toasted), I see no
statistically significant change.  This is to be expected.

A few specific questions and comments follow, all minor.  Go ahead and mark the
patch ready for committer when you've acted on them, or declined to do so, to
your satisfaction.  I don't think a re-review will be needed.

Thanks,
nm

On Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote:

*** ./pl_exec.c.orig  2010-11-16 10:28:42.000000000 +0100
--- ./pl_exec.c       2010-11-22 13:33:01.597726809 +0100

The patch applies cleanly, but the format is slightly nonstandard (-p0 when
applied from src/pl/plpgsql/src, rather than -p1 from the root).

***************
*** 3944,3949 ****
--- 3965,3993 ----
                              *typeid = var->datatype->typoid;
                              *typetypmod = var->datatype->atttypmod;
+
+                             /*
+                              * explicit deTOAST and decomprim for varlena types

"decompress", perhaps?

fixed

+                              */
+                             if (var->should_be_detoasted)
+                             {
+                                     Datum dvalue;
+
+                                     Assert(!var->isnull);
+
+                                     oldcontext = MemoryContextSwitchTo(estate->fn_mcxt);
+                                     dvalue = PointerGetDatum(PG_DETOAST_DATUM(var->value));
+                                     if (dvalue != var->value)
+                                     {
+                                             if (var->freeval)
+                                                     free_var(var);
+                                             var->value = dvalue;
+                                             var->freeval = true;
+                                     }
+                                     MemoryContextSwitchTo(oldcontext);

This line adds trailing white space.

+                                     var->should_be_detoasted = false;
+                             }
+
                              *value = var->value;
                              *isnull = var->isnull;
                              break;
*** ./plpgsql.h.orig  2010-11-16 10:28:42.000000000 +0100
--- ./plpgsql.h       2010-11-22 13:12:38.897851879 +0100
***************
*** 644,649 ****
--- 645,651 ----
      bool            fn_is_trigger;
      PLpgSQL_func_hashkey *fn_hashkey;       /* back-link to hashtable key */
      MemoryContext fn_cxt;
+     MemoryContext   fn_mcxt;                /* link to function's memory context */
      Oid                     fn_rettype;
      int                     fn_rettyplen;
***************
*** 692,697 ****
--- 694,701 ----
      Oid                     rettype;                /* type of current retval */
      Oid                     fn_rettype;             /* info about declared function rettype */
+     MemoryContext   fn_mcxt;                /* link to function's memory context */
+
      bool            retistuple;
      bool            retisset;

I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch.  Is the
PLpgSQL_function.fn_mcxt leftover from an earlier design?

I have to access to top execution context from exec_eval_datum
function. This function can be called from parser's context, and
without explicit switch to top execution context a variables are
detoasted in wrong context.

I could not readily tell the difference between fn_cxt and fn_mcxt.  As I
understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context
used to cache facts across many transactions.  Perhaps name the member something
like "top_cxt", "exec_cxt" or "proc_cxt" and comment it like "/* SPI Proc memory
context */" to make this clearer.

I used a top_exec_cxt name

Pavel Stehule
Regards

Show quoted text

Attachments:

detoast.patchtext/x-patch; charset=US-ASCII; name=detoast.patchDownload+64-2
#7Noah Misch
noah@leadboat.com
In reply to: Pavel Stehule (#6)
Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:

I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.

Thanks. I edited the comments and white space somewhat, as attached. I'll go
ahead and mark it Ready for Committer.

Attachments:

detoast-v3.patchtext/plain; charset=us-asciiDownload+59-2
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#7)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:

I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.

Thanks. I edited the comments and white space somewhat, as attached. I'll go
ahead and mark it Ready for Committer.

I looked at this patch and found it fairly awkward. What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment? If it's to avoid detoasting when
the variable is read in a way that doesn't require detoasting, it fails
rather completely IMO, since exec_eval_datum certainly doesn't know
that.

The added memory context variable seems redundant as well.

regards, tom lane

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#8)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:

Noah Misch <noah@leadboat.com> writes:

On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:

I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.

Thanks.  I edited the comments and white space somewhat, as attached.  I'll go
ahead and mark it Ready for Committer.

I looked at this patch and found it fairly awkward.  What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment?  If it's to avoid detoasting when
the variable is read in a way that doesn't require detoasting, it fails
rather completely IMO, since exec_eval_datum certainly doesn't know
that.

I am not sure about false overhead of detoasting. This is a safe
variant. I don't would to decrease performance. Not sure if it's
necessary.

But detoasting on assignment isn't enought:

some most simple example - searching a maximum

for i in array_lower(a,1) .. array_upper(a,1)
loop
if x < a[i] then
x = a[i];
end if;
end loop;

in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres. So there is necessary to
modify a process. Detoast not on assign, but detoast on usage.

Regards

Pavel Stehule

The added memory context variable seems redundant as well.

I didn't find a pointer on top execution context available from
execution state. I am sure, so we have to switch to this context
explicitly.

Regards

Pavel Stehule

Show quoted text

                       regards, tom lane

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#9)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Pavel Stehule <pavel.stehule@gmail.com> writes:

2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:

I looked at this patch and found it fairly awkward.  What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment?

But detoasting on assignment isn't enought:

for i in array_lower(a,1) .. array_upper(a,1)
loop
if x < a[i] then
x = a[i];
end if;
end loop;

in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres.

How so? In what I'm envisioning, a would have been decompressed when it
was originally assigned to.

regards, tom lane

#11Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#10)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:

I looked at this patch and found it fairly awkward.  What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment?

But detoasting on assignment isn't enought:

for i in array_lower(a,1) .. array_upper(a,1)
loop
  if x < a[i] then
    x = a[i];
  end if;
end loop;

in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres.

How so?  In what I'm envisioning, a would have been decompressed when it
was originally assigned to.

oh, I understand now. I afraid so it can be overhad, because there can
be path where you doesn't use a some variables from parameter list.

There are lot of user procedures, where not all parameters are used,
so I think is better to wait on first usage. Probably these procedures
can be written in SQL or C, but it can decrese a performance of some
current trivial functions in plpgsql. So my strategy is simple - wait
with detoasting to last moment, but don't repeat detoasting. My
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Regards

Pavel Stehule

Show quoted text

                       regards, tom lane

#12Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#11)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#12)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

2011/1/19 Robert Haas <robertmhaas@gmail.com>:

On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Yeah.  Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem.  :-(

???

some general solution can be pretty hardcore - some like handlers
instead pointers on varlena :(

I know mainly about plpgsql performance problems - but it is my
interes. Plpgsql can be simply fixed - maybe 10 maybe less lines of
code.

Pavel

Show quoted text

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#12)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(

Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.

regards, tom lane

#15Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#14)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Yeah.  Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem.  :-(

Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).

I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.

In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint.  Detoast at assignment would likely need only a few lines
of code added in a single place.

OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#14)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).

Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(

Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.

What qualifies a patch as stop-gap scale? Pavel's patch is ~60 lines.

If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
VARATT_IS_EXTENDED there might be the least-harmful way to avoid it. Saving a
few more lines by moving the work to exec_assign_value probably does not justify
the associated performance regressions Pavel has cited.

nm

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#16)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:

In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.

What qualifies a patch as stop-gap scale? Pavel's patch is ~60 lines.

Yeah, but they're spread out all over plpgsql, and seem likely to
metastasize to other places --- the additional field that needs to be
initialized is the main culprit. I'd like a one-spot patch that will
be easy to remove when/if it's no longer needed.

If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
VARATT_IS_EXTENDED there might be the least-harmful way to avoid it.

I thought about that too, but adding an additional set of tests into
exec_eval_datum isn't free --- that's a hot spot for plpgsql execution.
Doing it in exec_assign_value would be significantly cheaper, first
because it's reasonable to assume that assignments are less frequent
than reads, and second because there's already a test there to detect
pass-by-ref datatypes, as well as a datumCopy() step that could be
skipped altogether when we detoast.

regards, tom lane

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#15)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(

Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).

I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.

Do you remember the previous discussion? One idea that was on the table
was to make the TOAST code maintain a cache of detoasted values, which
could be indexed by the toast pointer OIDs (toast rel OID + value OID),
and then PG_DETOAST_DATUM might give back a pointer into the cache
instead of a fresh value. In principle that could be done in a fairly
centralized way. The hard part is to know when a cache entry is not
actively referenced anymore ...

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

On Wed, Jan 19, 2011 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Yeah.  Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem.  :-(

Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).

I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.

Do you remember the previous discussion?  One idea that was on the table
was to make the TOAST code maintain a cache of detoasted values, which
could be indexed by the toast pointer OIDs (toast rel OID + value OID),
and then PG_DETOAST_DATUM might give back a pointer into the cache
instead of a fresh value.  In principle that could be done in a fairly
centralized way.  The hard part is to know when a cache entry is not
actively referenced anymore ...

I do remember that discussion. Aside from the problem you mention, it
also seems that maintaining the hash table and doing lookups into it
would have some intrinsic cost.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql

Robert Haas <robertmhaas@gmail.com> writes:

I do remember that discussion. Aside from the problem you mention, it
also seems that maintaining the hash table and doing lookups into it
would have some intrinsic cost.

Well, sure, but it's still far cheaper than going out to the toast table
(which will require multiple hashtable lookups, not to mention I/O and
possible lock blocking). If we could solve the refcounting problem I
think this would be a very significant win.

regards, tom lane

#21Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#21)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#22)
#24Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#23)
#25Noah Misch
noah@leadboat.com
In reply to: Pavel Stehule (#24)
#26Pavel Stehule
pavel.stehule@gmail.com
In reply to: Noah Misch (#25)
#27Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Pavel Stehule (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#27)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#30)
#32Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#30)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#32)
#34Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#33)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#34)
#36Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#35)
#37Noah Misch
noah@leadboat.com
In reply to: Pavel Stehule (#36)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#37)
#39Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#38)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#40)
#42Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#41)
#43Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#41)
#44Pavel Stehule
pavel.stehule@gmail.com
In reply to: Noah Misch (#42)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#42)
#46Noah Misch
noah@leadboat.com
In reply to: Pavel Stehule (#44)
#47Pavel Stehule
pavel.stehule@gmail.com
In reply to: Noah Misch (#46)
#48Pavel Stehule
pavel.stehule@gmail.com
In reply to: Robert Haas (#45)
#49Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#45)
#50Pavel Stehule
pavel.stehule@gmail.com
In reply to: Noah Misch (#49)
#51Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#49)
#52Bruce Momjian
bruce@momjian.us
In reply to: Pavel Stehule (#1)
#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#52)
#54Pavel Stehule
pavel.stehule@gmail.com
In reply to: Bruce Momjian (#52)