[RFC] Comments on PostPic project - Repost

Started by Domenico Rotirotialmost 16 years ago5 messages

drotiro@tiscali.it

almost 16 years ago

I had no answer from pgsql-general, so I'm reposting here.

Hello,
I would like to receive comments/suggestions about this project:
http://github.com/drotiro/postpic.

In short, it's an extension that enables image processing within the
database, adding a new type (image) and several functions.
The SQL and Java interfaces are documented on the project's wiki, so I'm not
talking about these here, but instead present some detail on the datatype's
implementation.

The image is represented by a struct containing some attributes (dimensions,
some exif tag: shoot date, exposure time...) and a large object holding the
actual image data.
The idea is to have attributes stored directly to allow for efficient
searching, while the large object seemed a reasonable choice to store the
possibly large image data (what are the LOBs for?).
With the current large objects implementation, when a new lo is created it
"lives" in the pg_largeobjects table, until someone calls lo_unlink on it.
In my case: I create the lo on behalf of the user, then store its oid in the
image's internal representation. At this point, the image can be inserted in
a table, processed and so on, but when it gets deleted the corresponding lo
remains dangling, unless someone or something (eg. a trigger) takes care on
destroying it.
Is there a way of placing some kind of hook on an object's deletion? A clean
way to do a reference counting on large objects?
To avoid polluting pg_largeobjects, almost all of the image processing
functions in PostPic return a 'temporary_image' object, which is just an
alias on bytea. (Btw: I defined it using a DOMAIN. A better way?). Temporary
images can be converted back to images when needed via a cast (often there
is a variant of the function doing this automatically).

Thanks in advance for your suggestions and contribution,
Domenico.

Albe Laurenz

laurenz.albe@wien.gv.at

almost 16 years ago

In reply to: Domenico Rotiroti (#1)

Re: [RFC] Comments on PostPic project - Repost

Domenico Rotiroti wrote:

I would like to receive comments/suggestions about this
project: http://github.com/drotiro/postpic.

In short, it's an extension that enables image processing
within the database, adding a new type (image) and several functions.

The image is represented by a struct containing some
attributes (dimensions, some exif tag: shoot date, exposure
time...) and a large object holding the actual image data.
The idea is to have attributes stored directly to allow for
efficient searching, while the large object seemed a
reasonable choice to store the possibly large image data
(what are the LOBs for?).
With the current large objects implementation, when a new lo
is created it "lives" in the pg_largeobjects table, until
someone calls lo_unlink on it. In my case: I create the lo on
behalf of the user, then store its oid in the image's
internal representation. At this point, the image can be
inserted in a table, processed and so on, but when it gets
deleted the corresponding lo remains dangling, unless someone
or something (eg. a trigger) takes care on destroying it.
Is there a way of placing some kind of hook on an object's
deletion? A clean way to do a reference counting on large objects?

If you want a system with reference counts, you'd probably have
to write it yourself using triggers.

There's the "vacuumlo" contrib module that removes orphaned
large objects.

To avoid polluting pg_largeobjects, almost all of the image
processing functions in PostPic return a 'temporary_image'
object, which is just an alias on bytea. (Btw: I defined it
using a DOMAIN. A better way?). Temporary images can be
converted back to images when needed via a cast (often there
is a variant of the function doing this automatically).

Why don't you use bytea instead of large objects in the database?
That way you won't have to worry about orphaned large objects,
and you don't have to convert to bytea upon retrieval.

Yours,
Laurenz Albe

Domenico Rotiroti

domenico.rotiroti@gmail.com

almost 16 years ago

In reply to: Albe Laurenz (#2)

Re: [RFC] Comments on PostPic project - Repost

Albe Laurenz <laurenz.albe@wien.gv.at> wrote:

There's the "vacuumlo" contrib module that removes orphaned
large objects.

Thanks for pointing out. Vacuumlo looks at oid and lo columns to find
orphaned lobs, but the idea could be easily adapted/extended to looks for
image columns.

Why don't you use bytea instead of large objects in the database?
That way you won't have to worry about orphaned large objects,
and you don't have to convert to bytea upon retrieval.

When I started coding PostPic I knew little about PostgreSQL's large

objects implementation, and was used to more 'traditional' BLOB types found
in other databases.

Regards,
Domenico

Domenico Rotiroti

drotiro@tiscali.it

almost 16 years ago

In reply to: Albe Laurenz (#2)

Re: [RFC] Comments on PostPic project - Repost

On Tue, Mar 16, 2010 at 1:04 PM, Albe Laurenz <laurenz.albe@wien.gv.at>wrote:

Show quoted text

Domenico Rotiroti wrote:

I would like to receive comments/suggestions about this
project: http://github.com/drotiro/postpic.

In short, it's an extension that enables image processing
within the database, adding a new type (image) and several functions.

The image is represented by a struct containing some
attributes (dimensions, some exif tag: shoot date, exposure
time...) and a large object holding the actual image data.
The idea is to have attributes stored directly to allow for
efficient searching, while the large object seemed a
reasonable choice to store the possibly large image data
(what are the LOBs for?).
With the current large objects implementation, when a new lo
is created it "lives" in the pg_largeobjects table, until
someone calls lo_unlink on it. In my case: I create the lo on
behalf of the user, then store its oid in the image's
internal representation. At this point, the image can be
inserted in a table, processed and so on, but when it gets
deleted the corresponding lo remains dangling, unless someone
or something (eg. a trigger) takes care on destroying it.
Is there a way of placing some kind of hook on an object's
deletion? A clean way to do a reference counting on large objects?

If you want a system with reference counts, you'd probably have
to write it yourself using triggers.

There's the "vacuumlo" contrib module that removes orphaned
large objects.

To avoid polluting pg_largeobjects, almost all of the image
processing functions in PostPic return a 'temporary_image'
object, which is just an alias on bytea. (Btw: I defined it
using a DOMAIN. A better way?). Temporary images can be
converted back to images when needed via a cast (often there
is a variant of the function doing this automatically).

Why don't you use bytea instead of large objects in the database?
That way you won't have to worry about orphaned large objects,
and you don't have to convert to bytea upon retrieval.

Yours,
Laurenz Albe

Domenico Rotiroti

drotiro@tiscali.it

almost 16 years ago

In reply to: Domenico Rotiroti (#3)

Re: [RFC] Comments on PostPic project - Repost

Albe Laurenz <laurenz.albe@wien.gv.at> wrote:

There's the "vacuumlo" contrib module that removes orphaned
large objects.

Thanks for pointing out. Vacuumlo looks at oid and lo columns to find
orphaned lobs, but the idea could be easily adapted/extended to looks for
image columns.

Why don't you use bytea instead of large objects in the database?
That way you won't have to worry about orphaned large objects,
and you don't have to convert to bytea upon retrieval.

When I started coding PostPic I knew little about PostgreSQL's large

objects implementation, and was used to more 'traditional' BLOB types found
in other databases.

Regards,
Domenico