patch: fix performance problems with repated decomprimation of varlena values in plpgsql
Hello
this patch remove a multiple detoasting of varlena values in plpgsql.
It is usable mainly for iteration over longer array directly loaded
from relation.
It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.
sample: table 10000 rows one column with array with 1000 string fields:
patched pl time: 6 sec
unpatched pl time: 170 sec
This doesn't change my opinion on FOR-IN-ARRAY cycle (is still
important for readability) - just remove one critical performance
issue
Regards
Pavel Stehule
Attachments:
varlena_accelerator.difftext/x-patch; charset=US-ASCII; name=varlena_accelerator.diffDownload+49-0
On 11/22/2010 07:46 AM, Pavel Stehule wrote:
Hello
this patch remove a multiple detoasting of varlena values in plpgsql.
It is usable mainly for iteration over longer array directly loaded
from relation.It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.sample: table 10000 rows one column with array with 1000 string fields:
patched pl time: 6 sec
unpatched pl time: 170 sec
Since you haven't told us exactly how you tested this it's hard to gauge
the test results.
cheers
andrew
2010/11/22 Andrew Dunstan <andrew@dunslane.net>:
On 11/22/2010 07:46 AM, Pavel Stehule wrote:
Hello
this patch remove a multiple detoasting of varlena values in plpgsql.
It is usable mainly for iteration over longer array directly loaded
from relation.It's doesn't have a impact on semantic or behave - it's just eliminate
some performance trap.sample: table 10000 rows one column with array with 1000 string fields:
patched pl time: 6 sec
unpatched pl time: 170 secSince you haven't told us exactly how you tested this it's hard to gauge the
test results.
sorry - it is related to tests from FOR-IN-ARRAY thread
create table t1000(x text[]);
CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;
create or replace function rndarray(int) returns text[] as
$$select array(select rndstr() from generate_series(1,$1)) $$ language sql;
insert into t1000 select rndarray(1000) from generate_series(1,10000);
CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text;
BEGIN
FOR i IN array_lower($1,1)..array_upper($1,1)
LOOP
EXIT WHEN l = $3;
IF $1[i] LIKE $2 THEN
s := s || $1[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$
test query: select avg(array_upper(filter02(x,'%AA%', 10),1)) from t1000;
Regards
Pavel Stehule
Show quoted text
cheers
andrew
Excerpts from Pavel Stehule's message of lun nov 22 10:01:23 -0300 2010:
sorry - it is related to tests from FOR-IN-ARRAY thread
test query: select avg(array_upper(filter02(x,'%AA%', 10),1)) from t1000;
Yeah, I can measure a 25x improvement in that test with the patch
applied.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hello Pavel,
I'm reviewing this patch for CommitFest 2011-01.
The patch seems fully desirable. It appropriately contains no documentation
updates. It contains no new tests, and that's probably fine, too; I can't think
of any corner cases where this would do something other than work correctly or
break things comprehensively.
Using your test case from here:
http://archives.postgresql.org/message-id/AANLkTikDCW+c-C4U4NgaOBhpFSZkb5Uy_ZuaTDZfPMSn@mail.gmail.com
I observed a 28x speedup, similar to ��lvaro's report. I also made my own test
case, attached, to evaluate this from a somewhat different angle and also to
consider the worst case. With a 100 KiB string (good case), I see a 4.8x
speedup. With a 1 KiB string (worst case - never toasted), I see no
statistically significant change. This is to be expected.
A few specific questions and comments follow, all minor. Go ahead and mark the
patch ready for committer when you've acted on them, or declined to do so, to
your satisfaction. I don't think a re-review will be needed.
Thanks,
nm
On Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote:
*** ./pl_exec.c.orig 2010-11-16 10:28:42.000000000 +0100 --- ./pl_exec.c 2010-11-22 13:33:01.597726809 +0100
The patch applies cleanly, but the format is slightly nonstandard (-p0 when
applied from src/pl/plpgsql/src, rather than -p1 from the root).
*************** *** 3944,3949 **** --- 3965,3993 ----*typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; + + /* + * explicit deTOAST and decomprim for varlena types
"decompress", perhaps?
+ */ + if (var->should_be_detoasted) + { + Datum dvalue; + + Assert(!var->isnull); + + oldcontext = MemoryContextSwitchTo(estate->fn_mcxt); + dvalue = PointerGetDatum(PG_DETOAST_DATUM(var->value)); + if (dvalue != var->value) + { + if (var->freeval) + free_var(var); + var->value = dvalue; + var->freeval = true; + } + MemoryContextSwitchTo(oldcontext);
This line adds trailing white space.
+ var->should_be_detoasted = false; + } + *value = var->value; *isnull = var->isnull; break;
*** ./plpgsql.h.orig 2010-11-16 10:28:42.000000000 +0100 --- ./plpgsql.h 2010-11-22 13:12:38.897851879 +0100
*************** *** 644,649 **** --- 645,651 ---- bool fn_is_trigger; PLpgSQL_func_hashkey *fn_hashkey; /* back-link to hashtable key */ MemoryContext fn_cxt; + MemoryContext fn_mcxt; /* link to function's memory context */Oid fn_rettype; int fn_rettyplen; *************** *** 692,697 **** --- 694,701 ---- Oid rettype; /* type of current retval */Oid fn_rettype; /* info about declared function rettype */ + MemoryContext fn_mcxt; /* link to function's memory context */ + bool retistuple; bool retisset;
I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch. Is the
PLpgSQL_function.fn_mcxt leftover from an earlier design?
I could not readily tell the difference between fn_cxt and fn_mcxt. As I
understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context
used to cache facts across many transactions. Perhaps name the member something
like "top_cxt", "exec_cxt" or "proc_cxt" and comment it like "/* SPI Proc memory
context */" to make this clearer.
Attachments:
plpgsql-detoast.sqltext/plain; charset=us-asciiDownload
Hello
2011/1/15 Noah Misch <noah@leadboat.com>:
Hello Pavel,
I'm reviewing this patch for CommitFest 2011-01.
Thank you very much,
I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.
The patch seems fully desirable. It appropriately contains no documentation
updates. It contains no new tests, and that's probably fine, too; I can't think
of any corner cases where this would do something other than work correctly or
break things comprehensively.Using your test case from here:
http://archives.postgresql.org/message-id/AANLkTikDCW+c-C4U4NgaOBhpFSZkb5Uy_ZuaTDZfPMSn@mail.gmail.com
I observed a 28x speedup, similar to Álvaro's report. I also made my own test
case, attached, to evaluate this from a somewhat different angle and also to
consider the worst case. With a 100 KiB string (good case), I see a 4.8x
speedup. With a 1 KiB string (worst case - never toasted), I see no
statistically significant change. This is to be expected.A few specific questions and comments follow, all minor. Go ahead and mark the
patch ready for committer when you've acted on them, or declined to do so, to
your satisfaction. I don't think a re-review will be needed.Thanks,
nmOn Mon, Nov 22, 2010 at 01:46:51PM +0100, Pavel Stehule wrote:
*** ./pl_exec.c.orig 2010-11-16 10:28:42.000000000 +0100 --- ./pl_exec.c 2010-11-22 13:33:01.597726809 +0100The patch applies cleanly, but the format is slightly nonstandard (-p0 when
applied from src/pl/plpgsql/src, rather than -p1 from the root).*************** *** 3944,3949 **** --- 3965,3993 ----*typeid = var->datatype->typoid; *typetypmod = var->datatype->atttypmod; + + /* + * explicit deTOAST and decomprim for varlena types"decompress", perhaps?
fixed
+ */ + if (var->should_be_detoasted) + { + Datum dvalue; + + Assert(!var->isnull); + + oldcontext = MemoryContextSwitchTo(estate->fn_mcxt); + dvalue = PointerGetDatum(PG_DETOAST_DATUM(var->value)); + if (dvalue != var->value) + { + if (var->freeval) + free_var(var); + var->value = dvalue; + var->freeval = true; + } + MemoryContextSwitchTo(oldcontext);This line adds trailing white space.
+ var->should_be_detoasted = false; + } + *value = var->value; *isnull = var->isnull; break;*** ./plpgsql.h.orig 2010-11-16 10:28:42.000000000 +0100 --- ./plpgsql.h 2010-11-22 13:12:38.897851879 +0100*************** *** 644,649 **** --- 645,651 ---- bool fn_is_trigger; PLpgSQL_func_hashkey *fn_hashkey; /* back-link to hashtable key */ MemoryContext fn_cxt; + MemoryContext fn_mcxt; /* link to function's memory context */Oid fn_rettype; int fn_rettyplen; *************** *** 692,697 **** --- 694,701 ---- Oid rettype; /* type of current retval */Oid fn_rettype; /* info about declared function rettype */ + MemoryContext fn_mcxt; /* link to function's memory context */ + bool retistuple; bool retisset;I only see PLpgSQL_execstate.fn_mcxt getting populated in this patch. Is the
PLpgSQL_function.fn_mcxt leftover from an earlier design?
I have to access to top execution context from exec_eval_datum
function. This function can be called from parser's context, and
without explicit switch to top execution context a variables are
detoasted in wrong context.
I could not readily tell the difference between fn_cxt and fn_mcxt. As I
understand it, fn_mcxt is the SPI procCxt, and fn_cxt is the long-lived context
used to cache facts across many transactions. Perhaps name the member something
like "top_cxt", "exec_cxt" or "proc_cxt" and comment it like "/* SPI Proc memory
context */" to make this clearer.
I used a top_exec_cxt name
Pavel Stehule
Regards
Show quoted text
Attachments:
detoast.patchtext/x-patch; charset=US-ASCII; name=detoast.patchDownload+64-2
On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:
I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.
Thanks. I edited the comments and white space somewhat, as attached. I'll go
ahead and mark it Ready for Committer.
Attachments:
detoast-v3.patchtext/plain; charset=us-asciiDownload+59-2
Noah Misch <noah@leadboat.com> writes:
On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:
I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.
Thanks. I edited the comments and white space somewhat, as attached. I'll go
ahead and mark it Ready for Committer.
I looked at this patch and found it fairly awkward. What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment? If it's to avoid detoasting when
the variable is read in a way that doesn't require detoasting, it fails
rather completely IMO, since exec_eval_datum certainly doesn't know
that.
The added memory context variable seems redundant as well.
regards, tom lane
2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
Noah Misch <noah@leadboat.com> writes:
On Sun, Jan 16, 2011 at 06:49:27PM +0100, Pavel Stehule wrote:
I am sending a updated version with little bit more comments. But I am
sure, so somebody with good English have to edit my comments.
Minimally I hope, so your questions will be answered.Thanks. I edited the comments and white space somewhat, as attached. I'll go
ahead and mark it Ready for Committer.I looked at this patch and found it fairly awkward. What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment? If it's to avoid detoasting when
the variable is read in a way that doesn't require detoasting, it fails
rather completely IMO, since exec_eval_datum certainly doesn't know
that.
I am not sure about false overhead of detoasting. This is a safe
variant. I don't would to decrease performance. Not sure if it's
necessary.
But detoasting on assignment isn't enought:
some most simple example - searching a maximum
for i in array_lower(a,1) .. array_upper(a,1)
loop
if x < a[i] then
x = a[i];
end if;
end loop;
in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres. So there is necessary to
modify a process. Detoast not on assign, but detoast on usage.
Regards
Pavel Stehule
The added memory context variable seems redundant as well.
I didn't find a pointer on top execution context available from
execution state. I am sure, so we have to switch to this context
explicitly.
Regards
Pavel Stehule
Show quoted text
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes:
2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
I looked at this patch and found it fairly awkward. What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment?
But detoasting on assignment isn't enought:
for i in array_lower(a,1) .. array_upper(a,1)
loop
if x < a[i] then
x = a[i];
end if;
end loop;
in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres.
How so? In what I'm envisioning, a would have been decompressed when it
was originally assigned to.
regards, tom lane
2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
2011/1/18 Tom Lane <tgl@sss.pgh.pa.us>:
I looked at this patch and found it fairly awkward. What is the point
of adding an additional flag to every variable, as opposed to just
forcibly detoasting during assignment?But detoasting on assignment isn't enought:
for i in array_lower(a,1) .. array_upper(a,1)
loop
if x < a[i] then
x = a[i];
end if;
end loop;in this cycle the variable a wasn't modified. Any access to this
variable means a detoast and decompres.How so? In what I'm envisioning, a would have been decompressed when it
was originally assigned to.
oh, I understand now. I afraid so it can be overhad, because there can
be path where you doesn't use a some variables from parameter list.
There are lot of user procedures, where not all parameters are used,
so I think is better to wait on first usage. Probably these procedures
can be written in SQL or C, but it can decrese a performance of some
current trivial functions in plpgsql. So my strategy is simple - wait
with detoasting to last moment, but don't repeat detoasting. My
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).
Regards
Pavel Stehule
Show quoted text
regards, tom lane
On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).
Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2011/1/19 Robert Haas <robertmhaas@gmail.com>:
On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(
???
some general solution can be pretty hardcore - some like handlers
instead pointers on varlena :(
I know mainly about plpgsql performance problems - but it is my
interes. Plpgsql can be simply fixed - maybe 10 maybe less lines of
code.
Pavel
Show quoted text
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).
Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(
Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.
regards, tom lane
On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.
OK.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jan 18, 2011 at 5:22 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
opinion isn't strong in this topic. One or twenty useless detoasting
isn't really significant in almost use cases (problem is thousands
detoasting).Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.
What qualifies a patch as stop-gap scale? Pavel's patch is ~60 lines.
If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
VARATT_IS_EXTENDED there might be the least-harmful way to avoid it. Saving a
few more lines by moving the work to exec_assign_value probably does not justify
the associated performance regressions Pavel has cited.
nm
Noah Misch <noah@leadboat.com> writes:
On Wed, Jan 19, 2011 at 12:10:16PM -0500, Tom Lane wrote:
In the meantime, the proposal at hand seems like a bit of a stop-gap,
which is why I'd prefer to see something with a very minimal code
footprint. Detoast at assignment would likely need only a few lines
of code added in a single place.
What qualifies a patch as stop-gap scale? Pavel's patch is ~60 lines.
Yeah, but they're spread out all over plpgsql, and seem likely to
metastasize to other places --- the additional field that needs to be
initialized is the main culprit. I'd like a one-spot patch that will
be easy to remove when/if it's no longer needed.
If adding PLpgSQL_var.should_be_detoasted is your main pain point, testing
VARATT_IS_EXTENDED there might be the least-harmful way to avoid it.
I thought about that too, but adding an additional set of tests into
exec_eval_datum isn't free --- that's a hot spot for plpgsql execution.
Doing it in exec_assign_value would be significantly cheaper, first
because it's reasonable to assume that assignments are less frequent
than reads, and second because there's already a test there to detect
pass-by-ref datatypes, as well as a datumCopy() step that could be
skipped altogether when we detoast.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(
Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).
I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.
Do you remember the previous discussion? One idea that was on the table
was to make the TOAST code maintain a cache of detoasted values, which
could be indexed by the toast pointer OIDs (toast rel OID + value OID),
and then PG_DETOAST_DATUM might give back a pointer into the cache
instead of a fresh value. In principle that could be done in a fairly
centralized way. The hard part is to know when a cache entry is not
actively referenced anymore ...
regards, tom lane
On Wed, Jan 19, 2011 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Jan 19, 2011 at 12:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
Yeah. Many-times-repeated detoasting is really bad, and this is not
the only place in the backend where we have this problem. :-(Yeah, there's been some discussion of a more general solution, and I
think I even had a trial patch at one point (which turned out not to
work terribly well, but maybe somebody will have a better idea someday).I'm pretty doubtful that there's going to be a general solution to
this problem - I think it's going to require gradual refactoring of
problem spots.Do you remember the previous discussion? One idea that was on the table
was to make the TOAST code maintain a cache of detoasted values, which
could be indexed by the toast pointer OIDs (toast rel OID + value OID),
and then PG_DETOAST_DATUM might give back a pointer into the cache
instead of a fresh value. In principle that could be done in a fairly
centralized way. The hard part is to know when a cache entry is not
actively referenced anymore ...
I do remember that discussion. Aside from the problem you mention, it
also seems that maintaining the hash table and doing lookups into it
would have some intrinsic cost.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
I do remember that discussion. Aside from the problem you mention, it
also seems that maintaining the hash table and doing lookups into it
would have some intrinsic cost.
Well, sure, but it's still far cheaper than going out to the toast table
(which will require multiple hashtable lookups, not to mention I/O and
possible lock blocking). If we could solve the refcounting problem I
think this would be a very significant win.
regards, tom lane