Question about MemoryContextRegisterResetCallback

Started by Michel Pelletierover 7 years ago5 messagesgeneral

pelletier.michel@gmail.com

over 7 years ago

Hello,

I'm working on an extension to wrap the GraphBLAS linear algebra package.
GraphBLAS provides a very flexible API over adjacency matrices for solving
graph problems. I've got Matrix and Vector types wrapped, build
aggregators and extraction functions to pivot tables into matrices and
back, and may of the core operations are supported for just one of the 960
different semirings that GraphBLAS supports, but i'm making good progress
and with some advanced macro'ing I hope to provide complete API access.

This is no doubt the most complex bit of C wrapper I've done for postgres,
and I've run into a bit of a snag. GraphBLAS objects are opaque handles
that have their own new/free functions. After reading mmgr/README I have
registered a callback with CurrentMemoryContext during my aggregator
function that builds values.

https://github.com/michelp/pggraphblas/blob/master/src/matrix.c#L80

I've got tests that work very well, up until I declare a matrix or vector
in a plpgsql function.

https://github.com/michelp/pggraphblas/blob/master/test.sql#L103

When using these objects from a function, their free function seems be be
called prematurely, as GraphBLAS raises an error that the object isn't
initialized when it tries to compare two matrices with 'matrix_eq' (the
free function "uninitializes" a handle). If I use CurTransactionContext
instead of CurrentMemoryContext, the function doesn't fail, but the server
segfaults on rollback.

For the brave and curious the test can reproduce the error, if you have
docker installed, just clone the repo and run './test.sh'. (The first
build takes a while due to compiling GraphBLAS). Here's an example failure:

https://gist.github.com/michelp/1ba3cc79996b8d3a963d974224a78f2d

Obviously there is something I'm doing wrong about these callbacks,
thinking my free function is getting called immediately after the statement
that creates it, so I'm not sure what context to register it under. Should
I create a new one? Register it to the CurrentMemoryContext parent maybe?
Any help from the gurus on this would be greatly appreciated!

Thanks,

-Michel

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Michel Pelletier (#1)

Re: Question about MemoryContextRegisterResetCallback

Michel Pelletier <pelletier.michel@gmail.com> writes:

This is no doubt the most complex bit of C wrapper I've done for postgres,
and I've run into a bit of a snag. GraphBLAS objects are opaque handles
that have their own new/free functions. After reading mmgr/README I have
registered a callback with CurrentMemoryContext during my aggregator
function that builds values.

I suppose what you're doing is returning a pointer to a GraphBLAS object
as a Datum (or part of a pass-by-ref Datum)? If so, that's not going
to work terribly well, because it ignores the problem that datatype-
independent code is going to assume it can copy Datum values using
datumCopy() or equivalent logic. More often than not, such copying
is done to move the value into a different memory context in preparation
for freeing the original context. If you delete the GraphBLAS object
when the original context is deleted, you now have a dangling pointer
in the copy.

We did invent some infrastructure awhile ago that could potentially
handle this sort of situation: it's the "expanded datum" stuff.
The idea here would be that your representation involving a GraphBLAS
pointer would be an efficient-to-operate-on expanded object. You
would need to be able to serialize and deserialize that representation
into plain self-contained Datums (probably varlena blobs), but hopefully
GraphBLAS is capable of going along with that. You'd still need a
memory context reset callback attached to each expanded object, to
free the associated GraphBLAS object --- but expanded objects are
explicitly aware of which context they're in, so at least in principle
that should work. (I'm not sure anyone's actually tried to build
an expanded-object representation that has external resources, so
we might find there are some bugs to fix there.)

Take a look at

src/include/utils/expandeddatum.h
src/backend/utils/adt/expandeddatum.c
src/backend/utils/adt/array_expanded.c
src/backend/utils/adt/expandedrecord.c

regards, tom lane

Michel Pelletier

pelletier.michel@gmail.com

over 7 years ago

In reply to: Tom Lane (#2)

Re: Question about MemoryContextRegisterResetCallback

On Sun, Jan 13, 2019 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I suppose what you're doing is returning a pointer to a GraphBLAS object
as a Datum (or part of a pass-by-ref Datum)? If so, that's not going
to work terribly well, because it ignores the problem that datatype-
independent code is going to assume it can copy Datum values using
datumCopy() or equivalent logic. More often than not, such copying
is done to move the value into a different memory context in preparation
for freeing the original context. If you delete the GraphBLAS object
when the original context is deleted, you now have a dangling pointer
in the copy.

We did invent some infrastructure awhile ago that could potentially
handle this sort of situation: it's the "expanded datum" stuff.
The idea here would be that your representation involving a GraphBLAS
pointer would be an efficient-to-operate-on expanded object. You
would need to be able to serialize and deserialize that representation
into plain self-contained Datums (probably varlena blobs), but hopefully
GraphBLAS is capable of going along with that. You'd still need a
memory context reset callback attached to each expanded object, to
free the associated GraphBLAS object --- but expanded objects are
explicitly aware of which context they're in, so at least in principle
that should work. (I'm not sure anyone's actually tried to build
an expanded-object representation that has external resources, so
we might find there are some bugs to fix there.)

Take a look at

src/include/utils/expandeddatum.h
src/backend/utils/adt/expandeddatum.c
src/backend/utils/adt/array_expanded.c
src/backend/utils/adt/expandedrecord.c

Ah I see, the water is much deeper here. Thanks for the detailed
explanation, expandeddatum.h was very helpful and I see now how
array_expanded works. If I run into any problems registering my callback
in the expanded context I'll repost back.

Thanks Tom!

-Michel

Show quoted text

regards, tom lane

Michel Pelletier

pelletier.michel@gmail.com

over 7 years ago

In reply to: Michel Pelletier (#3)

Re: Question about MemoryContextRegisterResetCallback

After absorbing some of the code you've pointed out I have a couple of
questions about my understanding before I start hacking on making expanded
matrices.

Serializing sparse matrices can be done with _expand/_build functions, and
the size is known, so I can implement the EOM_flatten_into_methods. From
the array examples, it looks like accessor functions are responsible for
detecting and unflattening themselves, so I think I've got that understood.

Reading expandeddatum.h says "The format appearing on disk is called the
data type's "flattened" representation. since it is required to be a
contiguous blob of bytes -- but the type can have an expanded
representation that is not. Data types must provide means to translate an
expanded representation back to flattened form."

It mentions "on disk" does this mean the flattened representation must be
binary compatible with what matrix_send emits? They will likely be the
same now, so I can see this as a convenience, but is it a requirement?
Future matrix_send implementations may do some form of compressed sparse
row format, which would be inefficient for in-memory copies.

Thanks again,

-Michel

On Sun, Jan 13, 2019 at 10:51 AM Michel Pelletier <
pelletier.michel@gmail.com> wrote:

Show quoted text

On Sun, Jan 13, 2019 at 9:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I suppose what you're doing is returning a pointer to a GraphBLAS object
as a Datum (or part of a pass-by-ref Datum)? If so, that's not going
to work terribly well, because it ignores the problem that datatype-
independent code is going to assume it can copy Datum values using
datumCopy() or equivalent logic. More often than not, such copying
is done to move the value into a different memory context in preparation
for freeing the original context. If you delete the GraphBLAS object
when the original context is deleted, you now have a dangling pointer
in the copy.

We did invent some infrastructure awhile ago that could potentially
handle this sort of situation: it's the "expanded datum" stuff.
The idea here would be that your representation involving a GraphBLAS
pointer would be an efficient-to-operate-on expanded object. You
would need to be able to serialize and deserialize that representation
into plain self-contained Datums (probably varlena blobs), but hopefully
GraphBLAS is capable of going along with that. You'd still need a
memory context reset callback attached to each expanded object, to
free the associated GraphBLAS object --- but expanded objects are
explicitly aware of which context they're in, so at least in principle
that should work. (I'm not sure anyone's actually tried to build
an expanded-object representation that has external resources, so
we might find there are some bugs to fix there.)

Take a look at

src/include/utils/expandeddatum.h
src/backend/utils/adt/expandeddatum.c
src/backend/utils/adt/array_expanded.c
src/backend/utils/adt/expandedrecord.c

Ah I see, the water is much deeper here. Thanks for the detailed
explanation, expandeddatum.h was very helpful and I see now how
array_expanded works. If I run into any problems registering my callback
in the expanded context I'll repost back.

Thanks Tom!

-Michel

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Michel Pelletier (#4)

Re: Question about MemoryContextRegisterResetCallback

Michel Pelletier <pelletier.michel@gmail.com> writes:

It mentions "on disk" does this mean the flattened representation must be
binary compatible with what matrix_send emits? They will likely be the
same now, so I can see this as a convenience, but is it a requirement?

No, your on-the-wire representation for send/recv can be different from
what you put on-disk. In fact, typically I'd recommend that it *should*
be different to some extent, to insulate COPY BINARY data from internal
representation choices and simplify any client code that might look at
the "binary" format. See for example numeric_send/recv, which don't
just transmit the internal format as-is --- and that was a good thing
because it let us implement various storage optimizations over the years
without breaking COPY BINARY compatibility. It's also a good idea to
try to make the on-the-wire representation independent of server
endianness and alignment rules.

The point of the comment you're looking at is that the "flat" varlena
representation that you have to translate to/from is the same as what
will be on-disk if the datum gets stored someplace.

regards, tom lane