Blobs in Postgresql

Started by Ron Olson over 18 years ago | 9 messages | general
#1Ron Olson
tachoknight@gmail.com

Hi all-

I am evaluating databases for use in a large project that will hold image
data as blobs. I know, everybody says to just store pointers to files on the
disk, but I can't do that here: the images are of a confidential nature and
access to the database (and resulting client app) will be highly restricted.
The underlying platform will likely be Linux, though Solaris x86-64 has been
suggested as well.

I did some tests with MySQL and found the results very sub-par. The
standard blob field only holds 64k (they have three types of blobs for
whatever reason), and the real problem is that my uploads and downloads have
failed because of packet size issues. This can be solved somewhat with
server settings, but I get the impression that blobs are barely supported.

So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefully transparently.

Thanks for any info,

Ron

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Ron Olson (#1)
Re: Blobs in Postgresql

> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.

PostgreSQL's BLOB implementation works well. We have used it without any
problems with images from 20K to 30M.

Regards
Pavel Stehule

#3Bruce Momjian
bruce@momjian.us
In reply to: Ron Olson (#1)
Re: Blobs in Postgresql

"Ron Olson" <tachoknight@gmail.com> writes:

> Hi all-
>
> I am evaluating databases for use in a large project that will hold image
> data as blobs. I know, everybody says to just store pointers to files on the
> disk...

Well not everyone. I usually do, but if you're not handling these blobs under
heavy load independent of the database (like web servers) then either approach
works.

> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.

There are basically two options. If you are not handling data that are too
large to copy around in memory, and you don't need to upload and download the
data in chunks (usually these are the same issue) then you can just store your
images in a bytea. Postgres transparently treats *all* large variable-sized
data, whether text, bytea, or arrays, like a blob: values too large to fit in
the row are stored in a separate table outside the main table (the mechanism
is called TOAST).

If your data can sometimes be so large that you cannot manipulate the whole
thing in memory all at once, then you'll have to look into the large object
interface, which is a set of functions starting with lo_*. Keep in mind that
Postgres expects to be able to handle a few copies of the data at the same
time; conservatively, expect 5 simultaneous copies to have to fit in memory.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#4Trent Shipley
trent_shipley@qwest.net
In reply to: Bruce Momjian (#3)
Re: Blobs in Postgresql

On Wednesday 2007-08-15 05:52, Gregory Stark wrote:

> "Ron Olson" <tachoknight@gmail.com> writes:
>
> > Hi all-
> >
> > I am evaluating databases for use in a large project that will hold image
> > data as blobs. I know, everybody says to just store pointers to files on
> > the disk...
>
> Well not everyone. I usually do, but if you're not handling these blobs
> under heavy load independent of the database (like web servers) then either
> approach works.

I've always wondered how you keep transactions working when you only store
pointers to large data. Do you need an external transaction manager to
ensure that the file doesn't get deleted when you "delete" the data via the
pointer? Do you need an external application that handles all deletes,
inserts, and updates?

#5Merlin Moncure
mmoncure@gmail.com
In reply to: Ron Olson (#1)
Re: Blobs in Postgresql

For fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way). I would be nervous about storing
blobs if they were very large.
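
For Java, a parameterized statement through JDBC might look like the sketch below. Treat it as an illustration under stated assumptions: the `evidence_image` table, its columns, and the `ImageStore` class are hypothetical names invented for this example, and it needs the PostgreSQL JDBC driver on the classpath plus a live connection to actually run. Only standard `java.sql` calls are used; whether the driver ships the parameter in binary on the wire depends on the driver and its version, so check its documentation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class ImageStore {
    // Hypothetical table:
    //   CREATE TABLE evidence_image (name text PRIMARY KEY, data bytea NOT NULL);
    static final String INSERT_SQL =
            "INSERT INTO evidence_image (name, data) VALUES (?, ?)";
    static final String SELECT_SQL =
            "SELECT data FROM evidence_image WHERE name = ?";

    /** Streams a file into a bytea column as a bound parameter (no SQL escaping). */
    static void save(Connection conn, String name, Path file)
            throws SQLException, IOException {
        try (InputStream in = Files.newInputStream(file);
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, name);
            // toIntExact fails loudly if the file is too big for an int length.
            ps.setBinaryStream(2, in, Math.toIntExact(Files.size(file)));
            ps.executeUpdate();
        }
    }

    /** Returns the stored bytes, or null if no row matches the name. */
    static byte[] load(Connection conn, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT_SQL)) {
            ps.setString(1, name);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }
}
```

Note that `load` pulls the whole value into memory, which is fine for modest images but is exactly the situation to be nervous about when the blobs are very large.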

Regarding the security issue, have you looked at encryption?

merlin

#6Ron Olson
tachoknight@gmail.com
In reply to: Merlin Moncure (#5)
Re: Blobs in Postgresql

The language is Java. I've run some tests and they work very well for 25 MB
files: everything works exactly the way it should, first time. MySQL had all kinds of
nasty surprises for me when I first started working with blobs, but I can
say that I took my code, changed the driver, and it all works like a champ
(mind you, this was a quick test app).

I haven't looked at encryption at the database level. Is such a thing
available? I know Oracle has some form of data encryption at the database
level so the nefarious DBA with the wide mustache and black brimmed hat
always going "ah ha ha ha ha" can't make off with the data, but does
Postgres have something similar?

BTW, to put this in context, the database will be designed to hold evidence
(well, photos and videos of it). Hence the compelling need for some security, as
well as the variation in file sizes.

#7Merlin Moncure
mmoncure@gmail.com
In reply to: Ron Olson (#6)
Re: Blobs in Postgresql

Well, my assumption was that you would encrypt the data on the client
side and store it that way.
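
Since the client is Java, client-side encryption can be done with the JDK's own `javax.crypto` classes before the bytes ever reach the database. The following is a minimal sketch, not a vetted design: the choice of AES-GCM, the 128-bit key, the layout (IV followed by ciphertext and tag), and the `ClientSideCrypto` class name are all assumptions made for illustration. Key storage and distribution, which is the hard part, is deliberately left out.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class ClientSideCrypto {
    private static final int IV_BYTES = 12;   // 96-bit nonce, customary for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh AES key; where it is kept afterwards is up to you. */
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    /** Returns IV followed by ciphertext+tag, ready to store as a blob. */
    static byte[] encrypt(SecretKey key, byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv); // never reuse an IV with the same key
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] sealed = cipher.doFinal(plain);
        byte[] out = new byte[IV_BYTES + sealed.length];
        System.arraycopy(iv, 0, out, 0, IV_BYTES);
        System.arraycopy(sealed, 0, out, IV_BYTES, sealed.length);
        return out;
    }

    /** Reverses encrypt(); throws if the stored bytes were modified. */
    static byte[] decrypt(SecretKey key, byte[] stored) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
        return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        byte[] image = {1, 2, 3, 4}; // stand-in for real image bytes
        byte[] stored = encrypt(key, image);
        // true if the round trip preserved the bytes
        System.out.println(Arrays.equals(image, decrypt(key, stored)));
    }
}
```

An authenticated mode was picked here because, for evidence, detecting tampering matters as much as secrecy. This version holds the whole image in memory, so very large videos would need a streaming variant.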

PostgreSQL has an open architecture. If you wanted to do the encryption
on the server, one possible approach that jumps out at me is to write
a small C function which receives the data, encrypts the image using a
key sent by the client (but not stored), and either stores the
encrypted image back in the database via SPI or writes it out to a
file.

There are many strategies for encrypting data. The first things to think
about are where the encryption happens, where the keys are stored, etc.

merlin

#8Ron Johnson
ron.l.johnson@cox.net
In reply to: Merlin Moncure (#7)
Re: Blobs in Postgresql

Client-side encryption is important, because with server-side
encryption, you are sending the Valuable Data across the wire (or,
even worse, over wireless!) in cleartext form.

It's more likely that there's a packet sniffer on the network than
an Evil DBA snooping around.

--
Ron Johnson, Jr.
Jefferson LA USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

#9Shane Ambler
pgsql@Sheeky.Biz
In reply to: Ron Johnson (#8)
Re: Blobs in Postgresql

The two options I see are:

1. the client encrypts the data and sends it to the DB

2. the client uses an SSL connection to the server to prevent snooping
and lets the DB encrypt for storage.

I would suggest looking at pgcrypto in contrib for server side encryption.

The main benefit I would see from the first is that it doesn't matter whether
another DB admin changes the server security settings. The new
guy may set up a new server and not enforce SSL connections. Of course, if
the client refuses non-SSL connections, you can prevent that.
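
On the Java side, refusing non-SSL connections can be expressed as connection properties handed to the driver. The sketch below assumes the PostgreSQL JDBC driver and its `ssl` and `sslmode` connection properties; the class and method names are hypothetical, and older driver versions may not understand `sslmode`, so check the documentation for the version you deploy.

```java
import java.util.Properties;

final class SecureConnectionSettings {
    /**
     * Builds driver properties that make the client insist on an encrypted,
     * certificate-verified connection instead of trusting the server's setup.
     */
    static Properties requireVerifiedSsl(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Ask the driver to use SSL for the connection.
        props.setProperty("ssl", "true");
        // verify-full: require SSL and check the server certificate and host name.
        props.setProperty("sslmode", "verify-full");
        return props;
    }
}
```

The result would then be passed to something like `DriverManager.getConnection("jdbc:postgresql://db.example.org/evidence", props)`, where the host and database names are placeholders. With settings like these, a new server that does not offer SSL should cause the connection to fail rather than silently fall back to cleartext.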

Either way, the app provides the key to decrypt the data for viewing, so
the developers, current and future, must maintain the security level you
choose.

What sort of security measures are taken for viewing the data? Will each
user have a security certificate on their own USB flash drive to allow
them to view the data? That could also prevent developers from
accessing the data. Or is their password enough to allow the program to
decrypt it for them?

It would really come down to which encryption method you find easiest to
implement that provides enough security for your needs.

--

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz