Blobs in Postgresql
Hi all-
I am evaluating databases for use in a large project that will hold image
data as blobs. I know, everybody says to just store pointers to files on the
disk...can't do it here...the images are of a confidential nature and access
to the database (and resulting client app) will be highly restricted. The
underlying platform will likely be Linux though Solaris x86-64 has been
suggested as well.
I did some tests with MySQL and found the results very sub-par...the
standard blob field only holds 64k (they have three types of blobs for
whatever reason) and the real problem is that my uploads and downloads have
failed because of packet size issues...this can be solved somewhat with
server settings, but I get the impression that blobs are barely supported.
So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefully transparently.
Thanks for any info,
Ron
So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefully transparently.
PostgreSQL BLOB implementation is well. We used it without any
problems with images from 20K-30M.
Regards
Pavel Stehule
"Ron Olson" <tachoknight@gmail.com> writes:
Hi all-
I am evaluating databases for use in a large project that will hold image
data as blobs. I know, everybody says to just store pointers to files on the
disk...
Well not everyone. I usually do, but if you're not handling these blobs under
heavy load independent of the database (like web servers) then either approach
works.
So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefully transparently.
There are basically two options. If you are not handling data that are too
large to copy around in memory, and you don't need to upload and download the
data in chunks (usually these are the same issue) then you can just store your
images in a bytea. Postgres transparently treats *all* large variable-sized
data whether text, bytea, arrays, like a blob. It stores it in a separate
table outside the main table.
If your data can sometimes be so large that you cannot manipulate the whole
thing in memory all at once (Keep in mind that Postgres expects to be able to
handle a few copies of the data at the same time. Conservatively expect 5
simultaneous copies to have to fit in memory.) then you'll have to look into
the large object interface which is a set of functions starting with lo_*
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
On Wednesday 2007-08-15 05:52, Gregory Stark wrote:
"Ron Olson" <tachoknight@gmail.com> writes:
Hi all-
I am evaluating databases for use in a large project that will hold image
data as blobs. I know, everybody says to just store pointers to files on
the disk...Well not everyone. I usually do, but if you're not handling these blobs
under heavy load independent of the database (like web servers) then either
approach works.
I've always wondered how you keep transactions working when you only store
pointers to large data. Do you need an external transaction manager to
insure that the file doesn't get deleted when you "delete" the data via the
pointer? Do you need an external application that handles all deletes,
inserts, and updates?
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote:
Hi all-
I am evaluating databases for use in a large project that will hold image
data as blobs. I know, everybody says to just store pointers to files on the
disk...can't do it here...the images are of a confidential nature and access
to the database (and resulting client app) will be highly restricted. The
underlying platform will likely be Linux though Solaris x86-64 has been
suggested as well.I did some tests with MySQL and found the results very sub-par...the
standard blob field only holds 64k (they have three types of blobs for
whatever reason) and the real problem is that my uploads and downloads have
failed because of packet size issues...this can be solved somewhat with
server settings, but I get the impression that blobs are barely supported.So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefully transparently.
for fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way). I would be nervous about storing
blobs if they were very large.
regarding the security issue, have you looked at encryption?
merlin
The language is Java. I've made some tests and they work very well for 25meg
files....works exactly the way it should, first time. MySQL had all kinds of
nasty surprises for me when I first started working with blobs, but I can
say that I took my code, changed the driver, and it all works like a champ
(mind you, this was a quick test app).
I haven't looked at encryption at the database level....is such a thing
available? I know Oracle has some form of data encryption at the database
level so the nefarious DBA with the wide mustache and black brimmed hat
always going "ah ha ha ha ha" can't make off with the data, but does
Postgres have something similar?
BTW, to put into context, the database will be designed to hold evidence
(well, photos and videos of). Thus the compelling need for some security, as
well as the variation in file sizes.
Show quoted text
On 8/17/07, Merlin Moncure <mmoncure@gmail.com> wrote:
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote:
Hi all-
I am evaluating databases for use in a large project that will hold
image
data as blobs. I know, everybody says to just store pointers to files on
the
disk...can't do it here...the images are of a confidential nature and
access
to the database (and resulting client app) will be highly restricted.
The
underlying platform will likely be Linux though Solaris x86-64 has been
suggested as well.I did some tests with MySQL and found the results very sub-par...the
standard blob field only holds 64k (they have three types of blobs for
whatever reason) and the real problem is that my uploads and downloadshave
failed because of packet size issues...this can be solved somewhat with
server settings, but I get the impression that blobs are barelysupported.
So turning to Postgresql, can I get any recommendations, suggestions and
tips on blob handling in the database? The image sizes will be pretty
variable, from a few kilobytes to several hundred megabytes, so I need
something that will handle the various file sizes, hopefullytransparently.
for fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way). I would be nervous about storing
blobs if they were very large.regarding the security issue, have you looked at encryption?
merlin
On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
The language is Java. I've made some tests and they work very well for 25meg
files....works exactly the way it should, first time. MySQL had all kinds of
nasty surprises for me when I first started working with blobs, but I can
say that I took my code, changed the driver, and it all works like a champ
(mind you, this was a quick test app).I haven't looked at encryption at the database level....is such a thing
available? I know Oracle has some form of data encryption at the database
level so the nefarious DBA with the wide mustache and black brimmed hat
always going "ah ha ha ha ha" can't make off with the data, but does
Postgres have something similar?BTW, to put into context, the database will be designed to hold evidence
(well, photos and videos of). Thus the compelling need for some security, as
well as the variation in file sizes.
Well, my assumption was that you would encrypt the data on the client
side and store it that way.
PostgreSQL has open architecture. If you wanted to do the encryption
on the server, one possible approach that jumps out at me is to write
a small C function which receives the data, encrypts the image using a
key sent by the client all (but not stored), and either stores the
encrypted image back in the database via SPI or writes it out to a
file.
There are many strategies to encrypting data...first thing to think
about is where the encryption happens, where the keys are stored, etc.
merlin
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 08/17/07 23:16, Merlin Moncure wrote:
On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
The language is Java. I've made some tests and they work very well for 25meg
files....works exactly the way it should, first time. MySQL had all kinds of
nasty surprises for me when I first started working with blobs, but I can
say that I took my code, changed the driver, and it all works like a champ
(mind you, this was a quick test app).I haven't looked at encryption at the database level....is such a thing
available? I know Oracle has some form of data encryption at the database
level so the nefarious DBA with the wide mustache and black brimmed hat
always going "ah ha ha ha ha" can't make off with the data, but does
Postgres have something similar?BTW, to put into context, the database will be designed to hold evidence
(well, photos and videos of). Thus the compelling need for some security, as
well as the variation in file sizes.Well, my assumption was that you would encrypt the data on the client
side and store it that way.PostgreSQL has open architecture. If you wanted to do the encryption
on the server, one possible approach that jumps out at me is to write
a small C function which receives the data, encrypts the image using a
key sent by the client all (but not stored), and either stores the
encrypted image back in the database via SPI or writes it out to a
file.There are many strategies to encrypting data...first thing to think
about is where the encryption happens, where the keys are stored, etc.
Client-side encryption is important, because with server-side
encryption, you are sending the Valuable Data across the wire (or,
even worse!) wireless in cleartext form.
It's more likely that there's a packet sniffer on the network than
an Evil DBA snooping around.
- --
Ron Johnson, Jr.
Jefferson LA USA
Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFGxwf7S9HxQb37XmcRAimGAJ98Kykormb63BedYknIij2xZvDgEACgw23C
eWn7JJKSs1KL9dSfVx3p/BY=
=OLl1
-----END PGP SIGNATURE-----
Ron Johnson wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1On 08/17/07 23:16, Merlin Moncure wrote:
On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
The language is Java. I've made some tests and they work very well for 25meg
files....works exactly the way it should, first time. MySQL had all kinds of
nasty surprises for me when I first started working with blobs, but I can
say that I took my code, changed the driver, and it all works like a champ
(mind you, this was a quick test app).I haven't looked at encryption at the database level....is such a thing
available? I know Oracle has some form of data encryption at the database
level so the nefarious DBA with the wide mustache and black brimmed hat
always going "ah ha ha ha ha" can't make off with the data, but does
Postgres have something similar?BTW, to put into context, the database will be designed to hold evidence
(well, photos and videos of). Thus the compelling need for some security, as
well as the variation in file sizes.Well, my assumption was that you would encrypt the data on the client
side and store it that way.PostgreSQL has open architecture. If you wanted to do the encryption
on the server, one possible approach that jumps out at me is to write
a small C function which receives the data, encrypts the image using a
key sent by the client all (but not stored), and either stores the
encrypted image back in the database via SPI or writes it out to a
file.There are many strategies to encrypting data...first thing to think
about is where the encryption happens, where the keys are stored, etc.Client-side encryption is important, because with server-side
encryption, you are sending the Valuable Data across the wire (or,
even worse!) wireless in cleartext form.It's more likely that there's a packet sniffer on the network than
an Evil DBA snooping around.
The two options I see are -
1. the client encrypts the data and sends it to the DB
2. the client uses an SSL connection to the server to prevent snooping
and lets the DB encrypt for storage.
I would suggest looking at pgcrypto in contrib for server side encryption.
The main benefit I would see from the first is it doesn't matter if
another DB admin changes the server security settings or not. The new
guy may setup a new server and not enforce SSL connections. Of course if
the client refused non-SSL connections you can prevent that.
Either way the app provides the key to decrypt the data for viewing, so
the developers current and future must maintain the security level you
choose.
What sort of security measures are taken for viewing the data? Will each
user have a security certificate on their own USB flash drive to allow
them to view the data? which could also prevent developers from
accessing the data. Or is their password enough to allow the program to
decrypt it for them?
It would really come down to which encryption method you find easiest to
implement that provides enough security for your needs.
--
Shane Ambler
pgSQL@Sheeky.Biz
Get Sheeky @ http://Sheeky.Biz