hooks for supporting third party blobs?

Started by Eric Davies, over 21 years ago, 3 messages, general
#1 Eric Davies
eric@barrodale.com

A recent project of ours involved storing/fetching some reasonably large
datasets in a home-brew datatype. The datasets tended to range from a few
megabytes, to several gigabytes. We were seeing some nonlinear slowness
with using native large objects with larger datasets, presumably due to the
increasing depth of the btree index used to track all the little pieces of
the blobs.

After some careful consideration, we implemented an alternative to large
objects, a system based on storing files in a particular directory, and
storing a reference to the files in the database. It worked and gave us
good and consistent performance. However, it doesn't support transactions
(no isolation, no rollback). We can probably implement some backend code to
support such functionality, but the trick is getting the postgres server to
keep our code in the loop (so to speak) about when a rollback should be
done (and how far back it should go).

Is anyone aware of any hooks to support schemes such as ours, or has solved
a similar problem?

Thank you.

**********************************************
Eric Davies, M.Sc.
Barrodale Computing Services Ltd.
Tel: (250) 472-4372 Fax: (250) 472-4373
Web: http://www.barrodale.com
Email: eric@barrodale.com
**********************************************
Mailing Address:
P.O. Box 3075 STN CSC
Victoria BC Canada V8W 3W2

Shipping Address:
Hut R, McKenzie Avenue
University of Victoria
Victoria BC Canada V8W 3W2
**********************************************

#2 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Eric Davies (#1)
Re: hooks for supporting third party blobs?

On Mon, Dec 06, 2004 at 05:11:21PM -0800, Eric Davies wrote:

> Is anyone aware of any hooks to support schemes such as ours, or has solved
> a similar problem?

There are the RegisterXactCallback() and RegisterSubXactCallback() functions
that may be what you want. They are called whenever a transaction or
subtransaction starts, commits, or aborts. You could probably keep a
list of things modified during the transaction, so you can clean up at
transaction end.

(Much like the storage manager does: it only unlinks files for dropped
tables at transaction commit.)

Make sure to react appropriately at subtransaction abort ...
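
The bookkeeping suggested above can be sketched as follows. This is only a model in Python, not backend code: the class PendingFileOps and its method names are hypothetical, and in a real extension the commit/abort methods would be driven from C callbacks registered with RegisterXactCallback() and RegisterSubXactCallback(). The sketch assumes files are created eagerly and deleted lazily, mirroring the storage manager behaviour mentioned above, and it covers rollback only, not isolation.

```python
import os
import tempfile


def _unlink(path):
    """Remove a file, ignoring the case where it is already gone."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


class PendingFileOps:
    """Hypothetical per-transaction list of external file changes.

    Each entry records the subtransaction nesting level at which it was
    made (level 1 is the top-level transaction).
    """

    def __init__(self):
        self._reset()

    def _reset(self):
        self.level = 1
        self.created = []    # (level, path): unlink if that level aborts
        self.to_delete = []  # (level, path): unlink only at final commit

    def create(self, path, data):
        with open(path, "wb") as f:
            f.write(data)
        self.created.append((self.level, path))

    def delete(self, path):
        # Deferred: the file must survive if the transaction aborts.
        self.to_delete.append((self.level, path))

    def start_sub(self):
        self.level += 1

    def commit_sub(self):
        # Work done in the subtransaction now belongs to its parent.
        parent = self.level - 1
        self.created = [(min(l, parent), p) for l, p in self.created]
        self.to_delete = [(min(l, parent), p) for l, p in self.to_delete]
        self.level = parent

    def abort_sub(self):
        # Undo only what this subtransaction (and its children) did.
        lvl = self.level
        for l, p in self.created:
            if l >= lvl:
                _unlink(p)
        self.created = [(l, p) for l, p in self.created if l < lvl]
        self.to_delete = [(l, p) for l, p in self.to_delete if l < lvl]
        self.level = lvl - 1

    def commit(self):
        for _, p in self.to_delete:
            _unlink(p)
        self._reset()

    def abort(self):
        for _, p in self.created:
            _unlink(p)
        self._reset()
```

Tagging each entry with its nesting level is what makes subtransaction abort manageable: only entries at or below the aborted level are undone, while earlier work in the enclosing transaction is left alone.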

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Si quieres ser creativo, aprende el arte de perder el tiempo"

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Eric Davies (#1)
Re: hooks for supporting third party blobs?

Eric Davies <Eric@barrodale.com> writes:

> A recent project of ours involved storing/fetching some reasonably large
> datasets in a home-brew datatype. The datasets tended to range from a few
> megabytes, to several gigabytes. We were seeing some nonlinear slowness
> with using native large objects with larger datasets, presumably due to the
> increasing depth of the btree index used to track all the little pieces of
> the blobs.

Did you do any profiling to back up that "presumably"? It seems at
least as likely to me that this was caused by some easily-fixed
inefficiency somewhere. There are still a lot of O(N^2) algorithms
in the backend that no one has run up against yet ...
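
A rough first check, before profiling proper, is to time the same operation at several dataset sizes and estimate the growth exponent from a log-log fit. The helper below is a minimal sketch with a hypothetical name (scaling_exponent); it is not a substitute for a profiler. Since a btree's depth grows only logarithmically, index depth alone should give an exponent barely above 1, whereas an O(N^2) step would push it toward 2.

```python
import math


def scaling_exponent(sizes, seconds):
    """Least-squares slope of log(time) against log(size).

    A result near 1 suggests roughly linear cost; a result near 2
    suggests a quadratic step somewhere in the path being timed.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in seconds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

The inputs would be real measurements, for example the wall-clock time to write a large object at each of several sizes; at least three or four sizes spread over a wide range are needed for the slope to mean much.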

regards, tom lane