Manipulating complex types as non-contiguous structures in-memory
I've been fooling around with a design to support computation-oriented,
not-necessarily-contiguous-blobs representations of datatypes in memory,
along the lines I mentioned here:
/messages/by-id/2355.1382710707@sss.pgh.pa.us
In particular this is meant to reduce the overhead for repeated operations
on arrays, records, etc. We've had several previous discussions about
that, and even some single-purpose patches such as in this thread:
/messages/by-id/CAFj8pRAKuDU_0md-dg6Ftk0wSupvMLyrV1PB+HyC+GUBZz346w@mail.gmail.com
There was also a thread discussing how this sort of thing could be
useful to PostGIS:
/messages/by-id/526A61FB.1050209@oslandia.com
and it's been discussed a few other times too, but I'm too lazy to
search the archives any further.
I've now taken this idea as far as building the required infrastructure
and revamping a couple of array operators to use it. There's a lot yet
to do, but I've done enough to get some preliminary ideas about
performance (see below).
The core ideas of this patch are:
* Invent a new TOAST datum type "pointer to deserialized object", which
is physically just like the existing indirect-toast-pointer concept, but
it has a different va_tag code and somewhat different semantics.
* A deserialized object has a standard header (which is what the toast
pointers point to) and typically will have additional data-type-specific
fields after that. One component of the standard header is a pointer to
a set of "method" functions that provide ways to accomplish standard
data-type-independent operations on the deserialized object.
* Another standard header component is a MemoryContext identifier: the
header, as well as all subsidiary data belonging to the deserialized
object, must live in this context. (Well, I guess there could also be
child contexts.) By exposing an explicit context identifier, we can
accomplish tasks like "move this object into another context" by
reparenting the object's context rather than physically copying anything.
* The only standard "methods" I've found a need for so far are functions
to re-serialize the object, that is generate a plain varlena value that is
semantically equivalent. To avoid extra copying, this is split into
separate "compute the space needed" and "serialize into this memory"
steps, so that the result can be dropped exactly where the caller needs
it.
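(The standard header, method table, and two-step reserialization described above can be sketched as a standalone toy in C. Every name here -- ExpandedHeader, ExpandedMethods, the int-array type -- is invented for illustration, not the patch's actual identifiers, and a plain pointer stands in for the MemoryContext field:)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ExpandedHeader ExpandedHeader;

/* Data-type-independent "methods", split so the caller can drop the
 * flat (serialized) result exactly where it needs it. */
typedef struct ExpandedMethods {
    size_t (*get_flat_size)(ExpandedHeader *hdr);             /* step 1 */
    void (*flatten_into)(ExpandedHeader *hdr,
                         void *result, size_t allocated);      /* step 2 */
} ExpandedMethods;

struct ExpandedHeader {
    const ExpandedMethods *methods;
    /* in the real design, a MemoryContext identifier would live here too */
};

/* A toy deserialized int array: standard header first, then
 * type-specific fields; the element data need not be contiguous. */
typedef struct ExpandedIntArray {
    ExpandedHeader hdr;
    int nelems;
    int *elems;
} ExpandedIntArray;

static size_t intarray_flat_size(ExpandedHeader *hdr)
{
    ExpandedIntArray *a = (ExpandedIntArray *) hdr;
    /* flat form: element count followed by the elements */
    return sizeof(int) * (size_t) (a->nelems + 1);
}

static void intarray_flatten_into(ExpandedHeader *hdr,
                                  void *result, size_t allocated)
{
    ExpandedIntArray *a = (ExpandedIntArray *) hdr;
    int *out = (int *) result;

    assert(allocated >= intarray_flat_size(hdr));
    out[0] = a->nelems;
    memcpy(out + 1, a->elems, sizeof(int) * (size_t) a->nelems);
}

static const ExpandedMethods intarray_methods = {
    intarray_flat_size,
    intarray_flatten_into
};

/* Generic entry points: callers need not know the concrete type. */
size_t flat_size(ExpandedHeader *hdr)
{
    return hdr->methods->get_flat_size(hdr);
}

void flatten_into(ExpandedHeader *hdr, void *buf, size_t allocated)
{
    hdr->methods->flatten_into(hdr, buf, allocated);
}
```

(The point of the two-step split: code forming a tuple can ask for the size, reserve exactly that much space inside the tuple being built, and serialize directly into it, with no intermediate copy.)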
* Currently, a deserialized object will be reserialized in that way
whenever we incorporate it into a physical tuple (ie, heap_form_tuple
or index_form_tuple), or whenever somebody applies datumCopy() to it.
I'd like to relax this later, but there's an awful lot of code that
supposes that heap_form_tuple or datumCopy will produce a self-contained
value that survives beyond, eg, destruction of the memory context that
contained the source Datums. We can get good speedups in a lot of
interesting cases without solving that problem, so I don't feel too bad
about leaving it as a future project.
* In particular, things like PG_GETARG_ARRAYTYPE_P() treat a deserialized
toast pointer as something to be detoasted, and will produce a palloc'd
re-serialized value. This means that we do not need to convert all the
C functions concerned with a given datatype at the same time (or indeed
ever); a function that hasn't been upgraded will build a re-serialized
representation and then operate on that. We'll invent alternate
argument-fetching functions that skip the reserialization step, for use
by functions that have been upgraded to handle either case. This is
basically the same approach we used when we introduced short varlena
headers, and that seems to have gone smoothly enough.
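(The compatibility story -- unconverted functions always get a flat, self-contained copy, upgraded ones can accept either representation -- might look like this toy sketch. The tag, struct, and getarg names are made up for illustration; they are not the actual macros:)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TAG_FLAT, TAG_EXPANDED } ValueTag;

typedef struct Value {
    ValueTag tag;
    const int *flat;   /* flat form: element count, then elements */
    int nelems;        /* expanded form, valid when tag == TAG_EXPANDED */
    const int *elems;
} Value;

/* Legacy path (in the spirit of PG_GETARG_ARRAYTYPE_P): if handed a
 * deserialized value, reserialize it, so the caller always sees a
 * self-contained flat copy it can safely scribble on or keep. */
int *getarg_flat(const Value *v)
{
    int *out;

    if (v->tag == TAG_FLAT) {
        size_t nbytes = (size_t) (v->flat[0] + 1) * sizeof(int);
        out = malloc(nbytes);
        memcpy(out, v->flat, nbytes);
        return out;
    }
    out = malloc((size_t) (v->nelems + 1) * sizeof(int));
    out[0] = v->nelems;
    memcpy(out + 1, v->elems, (size_t) v->nelems * sizeof(int));
    return out;
}

/* Upgraded path: skip the reserialization entirely; the function body
 * is responsible for handling both tags itself. */
const Value *getarg_any(const Value *v)
{
    return v;
}
```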
* There's a concept that a deserialized object has a "primary" toast
pointer, which is physically part of the object, as well as "secondary"
toast pointers which might or might not be part of the object. If you
have a Datum pointer to the primary toast pointer then you are authorized
to modify the object in-place; if you have a Datum pointer to a secondary
toast pointer then you must treat it as read-only (ie, you have to make a
copy if you're going to change it). Functions that construct a new
deserialized object always return its primary toast pointer; this allows a
nest of functions to modify an object in-place without copying, which was
the primary need that the PostGIS folks expressed. On the other hand,
plpgsql can hand out secondary toast pointers to deserialized objects
stored in plpgsql function variables, thus ensuring that the objects won't
be modified unexpectedly, while never having to physically copy them if
the called functions just need to inspect them.
* Primary and secondary pointers are physically identical, but the
primary pointer resides in a specific spot in the deserialized object's
standard header. (So you can tell if you've got the primary pointer via
a simple address comparison.)
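(A toy model of that address-comparison trick, with invented names: both pointers are bit-for-bit identical, and "primary" just means "the copy stored at a known offset inside the object's own header":)

```c
#include <assert.h>

typedef struct ToastPointer {
    void *object;   /* points at the deserialized object's header */
} ToastPointer;

typedef struct Expanded {
    ToastPointer primary;   /* fixed spot in the standard header */
    int payload;
} Expanded;

/* In-place modification is authorized only through the primary
 * pointer; any other copy must be treated as read-only. */
int pointer_is_read_write(Expanded *obj, ToastPointer *ptr)
{
    return ptr == &obj->primary;
}
```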
* I've modified the array element assignment path in plpgsql's
exec_assign_value so that, instead of passing a secondary toast pointer
to array_set() as you might expect from the above, it passes the primary
toast pointer, thus allowing array_set() to modify the variable in-place.
So an operation like "array_variable[x] := y" no longer incurs recopying
of the whole array, once the variable has been converted into deserialized
form. (If it's not yet, it becomes deserialized after the first such
assignment.) Also, assignment of an already-deserialized value to a
variable accomplishes that with a MemoryContext parent pointer swing
instead of physical copying, if what we have is the primary toast pointer,
which implies it's not referenced anywhere else.
* Any functions that plpgsql gives a read/write pointer to need to be
exceedingly careful to not leave a corrupted object behind if they fail
partway through. I've successfully written such a version of array_set(),
and it wasn't too painful, but this may be a limitation on the general
applicability of the whole approach.
* In the current patch, that restriction only applies to array_set()
anyway. But I would like to allow in-place updates for non-core cases.
For example in something like
hstore_var := hstore_var || 'foo=>bar';
we could plausibly pass a R/W pointer to hstore_concat and let it modify
hstore_var in place. But this would require knowing which such functions
are safe, or assuming that they all are, which might be an onerous
restriction.
* I soon noticed that I was getting a lot of empty "deserialized array"
contexts floating around. The attached patch addresses this in a quick
hack fashion by redefining ResetExprContext() to use
MemoryContextResetAndDeleteChildren() instead of MemoryContextReset(),
so that deserialized objects created within an expression evaluation
context go completely away at ResetExprContext(), rather than being left
behind as empty subcontext shells. We've talked more than once about
redefining mcxt.c's API so that MemoryContextReset() means what's
currently meant by MemoryContextResetAndDeleteChildren(), and if you
really truly do want to keep empty child contexts around then you need to
call something else instead. I did not go that far here, but I think we
should seriously consider biting the bullet and finally changing it.
* Although I said above that everything owned by a deserialized object
has to live in a single memory context, I do have ideas about relaxing
that. The core idea would be to invent a "memory context reset/delete
callback" feature in mcxt.c. Then a deserialized object could register
such a callback on its own memory context, and use the callback to clean
up resources outside its context. This is potentially useful for instance
for something like PostGIS, where an object likely includes some data that
was allocated with malloc not palloc because it was created by library
functions that aren't Postgres-aware. Another likely use-case is for
deserialized objects representing composite types to maintain reference
counts on their tuple descriptors instead of having to copy said
descriptors into their private contexts. This'd be material for a
separate patch though.
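(A minimal sketch of what such a reset/delete callback mechanism could look like, with invented names and malloc standing in for the real context machinery; the per-registration arg pointer is what lets a callback know which external resource to release:)

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ResetCallback {
    void (*func)(void *arg);
    void *arg;                    /* per-registration cleanup state */
    struct ResetCallback *next;
} ResetCallback;

typedef struct Context {
    ResetCallback *callbacks;
} Context;

void context_register_reset_callback(Context *cxt,
                                     void (*func)(void *), void *arg)
{
    ResetCallback *cb = malloc(sizeof *cb);

    cb->func = func;
    cb->arg = arg;
    cb->next = cxt->callbacks;
    cxt->callbacks = cb;
}

/* At reset/delete time, run the callbacks (LIFO) so resources living
 * outside the context -- malloc'd library data, tupledesc refcounts --
 * get released along with the context's own memory. */
void context_reset(Context *cxt)
{
    ResetCallback *cb = cxt->callbacks;

    cxt->callbacks = NULL;
    while (cb) {
        ResetCallback *next = cb->next;
        cb->func(cb->arg);
        free(cb);
        cb = next;
    }
}

/* Example callback: count invocations through the arg pointer. */
void count_cleanup(void *arg)
{
    ++*(int *) arg;
}
```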
So that's the plan, and attached is a very-much-WIP patch that uses this
approach to speed up plpgsql array element assignments (and not a whole
lot else as yet). Here's the basic test case I've been using:
create or replace function arraysetint(n int) returns int[] as $$
declare res int[] := '{}';
begin
for i in 1 .. n loop
res[i] := i;
end loop;
return res;
end
$$ language plpgsql strict;
In HEAD, this function's runtime grows as O(N^2), so for example
(with casserts off on my x86_64 workstation):
regression=# select array_dims(arraysetint(100000));
array_dims
------------
[1:100000]
(1 row)
Time: 7874.070 ms
With variable-length array elements, such as if you change the
int[] arrays to numeric[], it's even worse:
regression=# select array_dims(arraysetnum(100000));
array_dims
------------
[1:100000]
(1 row)
Time: 31177.340 ms
With the attached patch, those timings drop to 80 and 150 ms respectively.
It's not all peaches and cream: for the array_append operator, which is
also accelerated by the patch (mainly because it is too much in bed with
array_set to not fix at the same time ;-)), I tried examples like
explain analyze select array[1,2] || g || g || g from generate_series(1,1000000) g;
Very roughly, HEAD needs about 400 ns per || operator in this scenario.
With the patch, it's about 480 ns for the first operator and then 200 more
for each one accepting a prior operator's output. (Those numbers could
perhaps be improved with more-invasive refactoring of the array code.)
The extra initial overhead represents the time to convert the array[1,2]
constant to deserialized form during each execution of the first operator.
Still, if the worst-case slowdown is around 20% on trivially-sized arrays,
I'd gladly take that to have better performance on larger arrays. And I
think this example is close to the worst case for the patch's approach,
since it's testing small, fixed-element-length, no-nulls arrays, which is
what the existing code can handle without spending a lot of cycles.
Note that I've kept all the deserialized-array-specific code in its own file
for now, just for ease of hacking. That stuff would need to propagate into
the main array-related files in a more complete patch.
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
I'm not sure exactly how to push this forward. I would not want to
commit it without converting a significant number of array functions to
understand about deserialized inputs, and by the time I've finished that
work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS
folk would love to have this infrastructure in 9.5 not 9.6 so they could
make a start on fixing their issues. (Further down the pike, I'd plan to
look at adapting composite-type operations, JSONB, etc, to make use of
this approach, but that certainly isn't happening for 9.5.)
Thoughts, advice, better ideas?
regards, tom lane
Attachments:
deserialized-arrays-0.1.patch (text/x-diff; charset=us-ascii; +1655/-216)
Tom,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I've now taken this idea as far as building the required infrastructure
and revamping a couple of array operators to use it. There's a lot yet
to do, but I've done enough to get some preliminary ideas about
performance (see below).
Nice!
* Although I said above that everything owned by a deserialized object
has to live in a single memory context, I do have ideas about relaxing
that. The core idea would be to invent a "memory context reset/delete
callback" feature in mcxt.c. Then a deserialized object could register
such a callback on its own memory context, and use the callback to clean
up resources outside its context. This is potentially useful for instance
for something like PostGIS, where an object likely includes some data that
was allocated with malloc not palloc because it was created by library
functions that aren't Postgres-aware. Another likely use-case is for
deserialized objects representing composite types to maintain reference
counts on their tuple descriptors instead of having to copy said
descriptors into their private contexts. This'd be material for a
separate patch though.
Being able to register a callback to be used on deletion of the context
would certainly be very nice and strikes me as pretty independent of the
rest of this. You've probably thought of this already, but registering
the callback should probably allow the caller to pass in a pointer to be
passed back to the callback function when the delete happens, so that
there's a place for the metadata to be stored about what the callback
function needs to clean up when it's called.
So that's the plan, and attached is a very-much-WIP patch that uses this
approach to speed up plpgsql array element assignments (and not a whole
lot else as yet). Here's the basic test case I've been using:
I've not looked at the code at all as yet, but it makes sense to me.
With the attached patch, those timings drop to 80 and 150 ms respectively.
And those numbers are pretty fantastic and would address an area we
regularly get dinged on.
Still, if the worst-case slowdown is around 20% on trivially-sized arrays,
I'd gladly take that to have better performance on larger arrays. And I
think this example is close to the worst case for the patch's approach,
since it's testing small, fixed-element-length, no-nulls arrays, which is
what the existing code can handle without spending a lot of cycles.
Agreed.
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
Unfortunately, nothing comes to mind. Serialization is, at least, a
pretty well understood concept and so the naming will likely make sense
to newcomers, even if it's difficult to keep track of which is
serialized and which is deserialized.
I'm not sure exactly how to push this forward. I would not want to
commit it without converting a significant number of array functions to
understand about deserialized inputs, and by the time I've finished that
work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS
folk would love to have this infrastructure in 9.5 not 9.6 so they could
make a start on fixing their issues. (Further down the pike, I'd plan to
look at adapting composite-type operations, JSONB, etc, to make use of
this approach, but that certainly isn't happening for 9.5.)
Thoughts, advice, better ideas?
I'm not really a big fan of putting an infrastructure out there for
modules to use that we don't use ourselves (particularly when it's clear
that there are places where we could/should be). On the other hand,
this doesn't impact on-disk format and therefore I'm less worried that
we'll end up with a release-critical issue when we're getting ready to
put 9.5 out there.
So, I'm on the fence about it. I'd love to see all of this in 9.5 with
the array functions converted, but I don't think it'd be horrible if
only a subset had been done in time for 9.5. The others aren't going to
go anywhere and will still work. I do think it'd be better to have at
least some core users of this new infrastructure rather than just
putting it out there for modules to use but I agree it'd be a bit grotty
to have only some of the array functions converted.
Thanks!
Stephen
[ this is addressing a tangential point ... ]
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
* Although I said above that everything owned by a deserialized object
has to live in a single memory context, I do have ideas about relaxing
that. The core idea would be to invent a "memory context reset/delete
callback" feature in mcxt.c. Then a deserialized object could register
such a callback on its own memory context, and use the callback to clean
up resources outside its context.
Being able to register a callback to be used on deletion of the context
would certainly be very nice and strikes me as pretty independent of the
rest of this. You've probably thought of this already, but registering
the callback should probably allow the caller to pass in a pointer to be
passed back to the callback function when the delete happens, so that
there's a place for the metadata to be stored about what the callback
function needs to clean up when it's called.
Yeah, there would likely be use-cases for that outside of deserialized
objects. I could submit a separate patch for that now, but I'm hesitant
to add a mechanism without any use-case in the same patch. But maybe we
could find a caller somewhere in the core aggregate code --- there are
some aggregates that need cleanup callbacks already, IIRC, and maybe we
could change them to use a memory context callback instead of whatever
they're doing now.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Without having read the patch, I think this is great. I've been wishing
for something like this while working on my variant data type.
Are there any cases where we would want to use this on a non-variant?
Perhaps types where we're paying an alignment penalty?
On 2/10/15 3:00 PM, Stephen Frost wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
Unfortunately, nothing comes to mind. Serialization is, at least, a
pretty well understood concept and so the naming will likely make sense
to newcomers, even if it's difficult to keep track of which is
serialized and which is deserialized.
Apologies if I'm just dense, but what's the confusion? Is it what a
serialized datum means in this context? (de)serialized seems like a
perfectly logical name to me...
I'm not sure exactly how to push this forward. I would not want to
commit it without converting a significant number of array functions to
understand about deserialized inputs, and by the time I've finished that
work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS
folk would love to have this infrastructure in 9.5 not 9.6 so they could
make a start on fixing their issues. (Further down the pike, I'd plan to
look at adapting composite-type operations, JSONB, etc, to make use of
this approach, but that certainly isn't happening for 9.5.)
Thoughts, advice, better ideas?
I'm not really a big fan of putting an infrastructure out there for
modules to use that we don't use ourselves (particularly when it's clear
that there are places where we could/should be). On the other hand,
this doesn't impact on-disk format and therefore I'm less worried that
we'll end up with a release-critical issue when we're getting ready to
put 9.5 out there.
So, I'm on the fence about it. I'd love to see all of this in 9.5 with
the array functions converted, but I don't think it'd be horrible if
only a subset had been done in time for 9.5. The others aren't going to
go anywhere and will still work. I do think it'd be better to have at
least some core users of this new infrastructure rather than just
putting it out there for modules to use but I agree it'd be a bit grotty
to have only some of the array functions converted.
I think the solution here is to have people other than Tom do the
gruntwork of applying this to the remaining array code. My thought is
that if Tom shows how to do this correctly in a rather complex case (ie,
where you need to worry about primary vs secondary), then less
experienced hackers should be able to take the ball and run with it.
Maybe we won't get complete array coverage, but I think any performance
gains here are a win. And really, don't we just need enough usage so the
buildfarm will tell us if we accidentally break something?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
Without having read the patch, I think this is great. I've been wishing
for something like this while working on my variant data type.
Are there any cases where we would want to use this on a non-variant?
Perhaps types where we're paying an alignment penalty?
What do you mean by non-variant?
The use cases that have come to mind for me are:
* arrays, of course
* composite types (records)
* PostGIS geometry type
* JSONB, hstore
* possibly regex patterns (we could invent a data type representing these
and then store the compiled form as a deserialized representation;
although there would be some issues to be worked out to get any actual
win, probably)
The principal thing that's a bit hard to figure out is when it's a win to
convert to a deserialized representation and when you should just leave
well enough alone. I'm planning to investigate that further in the
context of plpgsql array variables, but I'm not sure how well those
answers will carry over to datatypes that plpgsql has no intrinsic
understanding of.
regards, tom lane
On 2/10/15 5:19 PM, Tom Lane wrote:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
Without having read the patch, I think this is great. I've been wishing
for something like this while working on my variant data type.
Are there any cases where we would want to use this on a non-variant?
Perhaps types where we're paying an alignment penalty?
What do you mean by non-variant?
Ugh, sorry, brainfart. I meant to say non-varlena.
I can't think of any non-varlena types we'd want this for, but maybe
someone else can think of a case. If there is a use-case I wouldn't
handle it with this patch, but we'd want to consider it...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
On 2/10/15 5:19 PM, Tom Lane wrote:
What do you mean by non-variant?
Ugh, sorry, brainfart. I meant to say non-varlena.
I can't think of any non-varlena types we'd want this for, but maybe
someone else can think of a case. If there is a use-case I wouldn't
handle it with this patch, but we'd want to consider it...
There isn't any practical way to interpose TOAST pointers for non-varlena
types, since we make no assumptions about the bit contents of fixed-length
types. But I'm having a hard time thinking of a fixed-length type in
which you'd have any need for a deserialized representation, either.
I think restricting this feature to varlena types is just fine.
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I think restricting this feature to varlena types is just fine.
Agreed.
Thanks,
Stephen
On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I've now taken this idea as far as building the required infrastructure
and revamping a couple of array operators to use it. There's a lot yet
to do, but I've done enough to get some preliminary ideas about
performance (see below).
Very impressive. This is something that's been mentioned before and
which I always thought would be great to have, but I didn't expect it
would be this easy to cobble together a working implementation. Or
maybe "easy" isn't the right term, but this isn't a very big patch.
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
I found the notion of "primary" and "secondary" TOAST pointers to be
quite confusing. I *think* what you are doing is storing two pointers
to the object in the object, and a pointer to the object is really a
pointer to one of those two pointers to the object. Depending on
which one it is, you can write the object, or not. This is a clever
representation, but it's hard to wrap your head around, and I'm not
sure "primary" and "secondary" are the best names, although I don't
have an idea as to what would be better. I'm a bit confused, though:
once you give out a secondary pointer, how is it safe to write the
object through the primary pointer?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Feb 10, 2015 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
Hm. I'm not against the concept, but those particular suggestions don't
grab me.
I found the notion of "primary" and "secondary" TOAST pointers to be
quite confusing. I *think* what you are doing is storing two pointers
to the object in the object, and a pointer to the object is really a
pointer to one of those two pointers to the object. Depending on
which one it is, you can write the object, or not.
There's more to it than that. (Writing more docs is one of the to-do
items ;-).) We could alternatively have done that with two different
va_tag values for "read write" and "read only", which indeed was my
initial intention before I thought of this dodge. However, then you
have to figure out where to store such pointers, which is problematic
both for plpgsql variable assignment and for ExecMakeSlotContentsReadOnly,
especially the latter which would have to put any freshly-made pointer
in a long-lived context resulting in query-lifespan memory leaks.
So I early decided that the read-write pointer should live right in the
object's own context where it need not be copied when swinging the
context ownership someplace else, and later realized that there should
also be a permanent read-only pointer in there for the use of
ExecMakeSlotContentsReadOnly, and then realized that they didn't need
to have different va_tag values if we implemented the "is read-write
pointer" test as it's done in the patch. Having only one va_tag value
not two saves cycles, I think, because there are a lot of low-level
tests that don't need to distinguish, eg VARTAG_SIZE(). However it
does make it more expensive when you do need to distinguish, so I might
reconsider that decision later. (Since these will never go to disk,
we can whack the representation around pretty freely if needed.)
Also, I have hopes of allowing deserialized-object pointers to be copied
into tuples as pointers rather than by reserialization, if we can
establish that the tuple is short-lived enough that the pointer will stay
good, which would be true in a lot of cases during execution of queries by
plpgsql. With the patch's design, a pointer so copied will automatically
be considered read-only, which I *think* is the behavior we'd need. If it
turns out that it's okay to propagate read-write-ness through such a copy
step then that would argue in favor of using two va_tag values.
It may be that this solution is overly cute and we should just use two
tag values. But I wanted to be sure it was possible for copying of a
pointer to automatically lose read-write-ness, in case we need to have
such a guarantee.
This is a clever
representation, but it's hard to wrap your head around, and I'm not
sure "primary" and "secondary" are the best names, although I don't
have an idea as to what would be better. I'm a bit confused, though:
once you give out a secondary pointer, how is it safe to write the
object through the primary pointer?
It's no different from allowing plpgsql to update the values of variables
of pass-by-reference types even though it has previously given out Datums
that are pointers to them: by the time we're ready to execute an
assignment, any query execution that had such a pointer is over and done
with. (This implies that cursor parameters have to be physically copied
into the cursor's execution state, which is one of a depressingly large
number of reasons why datumCopy() has to physically copy a deserialized
value rather than just copying the pointer. But otherwise it works.)
There is more work to do to figure out how we can safely give out a
read/write pointer for cases like
hstore_var := hstore_concat(hstore_var, ...);
Aside from the question of whether hstore_concat guarantees not to trash
the value on failure, we'd have to restrict this (I think) to expressions
in which there is only one reference to the target variable and it's an
argument of the topmost function/operator. But that's something I've not
tried to implement yet.
regards, tom lane
On Thu, Feb 12, 2015 at 9:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
Hm. I'm not against the concept, but those particular suggestions don't
grab me.
Fair enough. I guess the core of my point is just that I suggest we
invent a name for this thing. "Serialize" and "deserialize" describe
what you are doing just fine, but the mechanism itself should be
called something, I think. When you say "varlena header" or "TOAST
pointer" that is a name for a very particular thing, not just a
general category of things you might do. If we replaced every
instance of "TOAST pointer" to "reference to where the full value is
stored", it would be much less clear, and naming all of the related
functions would be harder.
This is a clever
representation, but it's hard to wrap your head around, and I'm not
sure "primary" and "secondary" are the best names, although I don't
have an idea as to what would be better. I'm a bit confused, though:
once you give out a secondary pointer, how is it safe to write the
object through the primary pointer?
It's no different from allowing plpgsql to update the values of variables
of pass-by-reference types even though it has previously given out Datums
that are pointers to them: by the time we're ready to execute an
assignment, any query execution that had such a pointer is over and done
with. (This implies that cursor parameters have to be physically copied
into the cursor's execution state, which is one of a depressingly large
number of reasons why datumCopy() has to physically copy a deserialized
value rather than just copying the pointer. But otherwise it works.)
OK, I see. So giving out a secondary pointer doesn't necessarily
preclude further changes via the primary pointer, but you'd better be
sure that you don't try until such time as all of those secondary
references are gone.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
The words that sprung to mind for me were: packed/unpacked.
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.
-- Arthur Schopenhauer
On 2/13/15 2:04 AM, Martijn van Oosterhout wrote:
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
The words that sprung to mind for me were: packed/unpacked.
+1
After thinking about it, I don't think having a more distinctive name
(like TOAST) is necessary for this feature. TOAST is something that's
rather visible to end users, whereas packing would only matter to
someone creating a new varlena type.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Martijn van Oosterhout <kleptog@svana.org> writes:
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
The words that sprung to mind for me were: packed/unpacked.
Trouble is that we're already using "packed" with a specific connotation
in that same area of the code, namely for short-header varlena values.
(See pg_detoast_datum_packed() etc.) So I don't think this will work.
But maybe a different adjective?
regards, tom lane
On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Martijn van Oosterhout <kleptog@svana.org> writes:
On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
BTW, I'm not all that thrilled with the "deserialized object" terminology.
I found myself repeatedly tripping up on which form was serialized and
which de-. If anyone's got a better naming idea I'm willing to adopt it.
My first thought is that we should form some kind of TOAST-like
backronym, like Serialization Avoidance Loading and Access Device
(SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD). I
don't think there is anything per se wrong with the terms
serialization and deserialization; indeed, I used the same ones in the
parallel-mode stuff. But they are fairly general terms, so it might
be nice to have something more specific that applies just to this
particular usage.
The words that sprung to mind for me were: packed/unpacked.
Trouble is that we're already using "packed" with a specific connotation
in that same area of the code, namely for short-header varlena values.
(See pg_detoast_datum_packed() etc.) So I don't think this will work.
But maybe a different adjective?
expanded?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Martijn van Oosterhout <kleptog@svana.org> writes:
The words that sprung to mind for me were: packed/unpacked.
Trouble is that we're already using "packed" with a specific connotation
in that same area of the code, namely for short-header varlena values.
(See pg_detoast_datum_packed() etc.) So I don't think this will work.
But maybe a different adjective?
expanded?
That seems to work from the standpoint of not conflicting with other
nearby usages in our code, and it's got the right semantics I think.
Any other suggestions out there? Otherwise I'll probably go with this.
regards, tom lane
Here's an updated version of the patch I sent before. Notable changes:
* I switched over to calling "deserialized" objects "expanded" objects,
and the default representation is now called "flat" or "flattened" instead
of "reserialized". Per suggestion from Robert.
* I got rid of the bit about detecting read-write pointers by address
comparison. Instead there are now two vartag values for R/W and R/O
pointers. After further reflection I concluded that my previous worry
about wanting copied pointers to automatically become read-only was
probably wrong, so there's no need for extra confusion here.
* I added support for extracting array elements from expanded values
(array_ref).
* I hacked plpgsql to force values of array-type variables into expanded
form; this is needed to get any win from the array_ref change if the
function doesn't do any assignments to elements of the array. This is an
improvement over the original patch, which hardwired array_set to force
expansion, but I remain unsatisfied with it as a long-term answer. It's
not clear that it's always a win to do this (but the tradeoff will change
as we convert more array support functions to handle expanded inputs, so
it's probably not worth getting too excited about that aspect of it yet).
A bigger complaint is that this approach cannot fix things for non-builtin
types such as hstore. I'm hesitant to add a pg_type column carrying an
expansion function OID, but there may be no other workable answer for
extension types.
The patch as it stands is able to do nice things with
create or replace function arraysetnum(n int) returns numeric[] as $$
declare res numeric[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;
create or replace function arraysumnum(arr numeric[]) returns numeric as $$
declare res numeric := 0;
begin
  for i in array_lower(arr, 1) .. array_upper(arr, 1) loop
    res := res + arr[i];
  end loop;
  return res;
end
$$ language plpgsql strict;
regression=# select arraysumnum(arraysetnum(100000));
arraysumnum
-------------
5000050000
(1 row)
Time: 304.336 ms
(versus approximately 1 minute in 9.4, although these numbers are for
cassert builds so should be taken with a grain of salt.) There are
still a couple more flattening/expansion conversions than I'd like,
in particular the array returned by arraysetnum() gets flattened on its
way out, which would be good to avoid.
I'm going to stick this into the commitfest even though it's not really
close to being committable; I see some other people doing likewise with
their pet patches ;-). What it could particularly do with some reviewing
help on is exploring the performance changes it creates; what cases does
it make substantially worse?
regards, tom lane
Attachment: expanded-arrays-0.2.patch (text/x-diff, +1852 -202)
On Sun, Feb 15, 2015 at 6:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm going to stick this into the commitfest even though it's not really
close to being committable; I see some other people doing likewise with
their pet patches ;-). What it could particularly do with some reviewing
help on is exploring the performance changes it creates; what cases does
it make substantially worse?
It's perfectly reasonable to add stuff that isn't committable yet to
the CF app; the point of the CF app is to track what needs reviewing.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attached is an 0.3 version, rebased over today's HEAD changes (applies to
commit 9e3ad1aac52454569393a947c06be0d301749362 or later), and with some
better logic for transferring expanded array values into and out of plpgsql
functions. Using this example:
create or replace function arraysetnum(n int) returns numeric[] as $$
declare res numeric[] := '{}';
begin
  for i in 1 .. n loop
    res[i] := i;
  end loop;
  return res;
end
$$ language plpgsql strict;
create or replace function arraysumnum(arr numeric[]) returns numeric as $$
declare res numeric := 0;
begin
  for i in array_lower(arr, 1) .. array_upper(arr, 1) loop
    res := res + arr[i];
  end loop;
  return res;
end
$$ language plpgsql strict;
create or replace function arraytimenum(n int) returns numeric as $$
declare tmp numeric[];
begin
  tmp := arraysetnum(n);
  return arraysumnum(tmp);
end
$$ language plpgsql strict;
either of the test cases
select arraysumnum(arraysetnum(100000));
select arraytimenum(100000);
involve exactly one coercion from flat to expanded array (during the
initial assignment of the '{}' constant to the "res" variable), no
coercions from expanded to flat, and no bulk copy operations.
So I'm starting to feel fairly good about this. Obviously there's a
nontrivial amount of work to do with integrating the array-code changes
and teaching the rest of the array functions about expanded arrays (or
at least as many of them as seem performance-critical). But that looks
like just a few days of basically-mechanical effort. A larger question
is what we ought to do about extending the array-favoring hacks in plpgsql
to support this type of optimization for non-built-in types.
Realize that what this patch is able to improve is basically two types
of cases:
* nests of function calls that take and return the same complex datatype,
think foo(bar(baz(x))), where x is stored in some flat format but foo()
bar() and baz() all agree on an expanded format that's easier to process.
* plpgsql variables stored in an expanded format that's easier to process
for most functions that might work with their values.
The first case can be implemented by mutual agreement among the functions
of the datatype; it does not need any additional help beyond what's in
this patch. But the second case does not work very well unless plpgsql
takes some proactive step to force variable values into the expanded
format. Otherwise you get a win only if the last assignment to the
variable happened to come from a source that supplied a read-write
expanded value. You can make that happen with appropriate coding in
the plpgsql function, of course, but it's klugy to have to do that.
I would not be ashamed to ship this in 9.5 as just an array optimization
and leave the larger question for next time ... but it does feel a bit
unfinished like this. OTOH, I'm not sure whether the PostGIS folk care
all that much about the intermediate-values-in-plpgsql-variables
scenario. They didn't bring it up in the discussion a year or so back
about their requirements. We do know very well that plpgsql array
variables are a performance pain point, so maybe fixing that is enough
of a goal for 9.5.
(BTW, the nested-function-calls case sure seems like it's dead center
of the wheelhouse for JSONB. Just sayin'. I do not myself have time
to think about applying this technology to JSONB right now, but does
anyone else want to step up?)
regards, tom lane
Attachment: expanded-arrays-0.3.patch (text/x-diff, +1941 -206)
Here's an 0.4 version, in which I've written some user docs, refactored
the array-specific code into a more reasonable arrangement, and adjusted
a lot of the built-in array functions to support expanded arrays directly.
This is about as far as I feel a need to take the latter activity, at
least for now; there are a few remaining operations that might be worth
converting but it's not clear they'd really offer much benefit.
I think this is actually now a serious candidate to be committed as-is,
not just a prototype. What we lack though is a clear understanding of
the performance characteristics.
regards, tom lane