Storing many big files in database- should I do it?

Started by Joe Kramer, almost 16 years ago · 20 messages · general
#1Joe Kramer
cckramer@gmail.com

Hello,

I have a web application where users upload and share files.
After a file is uploaded it is copied to S3, and all subsequent downloads
are done from there.
So in a file's lifetime it is accessed only twice: when created and
when copied to S3.

The files are documents of varying size, from a few kilobytes to 200
megabytes. Number of files: thousands to hundreds of thousands.

My dilemma is: should I store the files in the PostgreSQL database, or keep
them in the filesystem and store only metadata in the database?

I see the following possible cons of using PostgreSQL as storage:
- more network bandwidth required compared to accessing an NFS-mounted filesystem?
- if the database becomes corrupt you can't recover individual files
- you can't back up a live database unless you install complicated
replication add-ons
- more CPU required to store/retrieve files (compared to filesystem access)
- size overhead, e.g. storing 1000 bytes will take 1000 bytes in the
database + 100 bytes for db metadata, indexes, etc.; with a lot of files
this adds up to a lot of overhead

Are these concerns valid?
Has anyone had this kind of design problem, and how did you solve it?

Thanks.

#2John R Pierce
pierce@hogranch.com
In reply to: Joe Kramer (#1)
Re: Storing many big files in database- should I do it?

Rod wrote:

My dilemma is - Should I store files in PGSQL database or store in
filesystem and keep only metadata in database?

S3 storage is not suitable for running an RDBMS.

An RDBMS wants fast, low-latency storage with 8k-block random reads and
writes. S3 is high latency and oriented towards streaming.

#3Joe Kramer
cckramer@gmail.com
In reply to: John R Pierce (#2)
Re: Storing many big files in database- should I do it?

No, I'm not storing the RDBMS in S3; I didn't say that in my post.
S3 is used as a CDN, only for downloading files.

On Tue, Apr 27, 2010 at 6:54 PM, John R Pierce <pierce@hogranch.com> wrote:

S3 storage is not suitable for running a RDBMS.

#4Massa, Harald Armin
chef@ghum.de
In reply to: Joe Kramer (#3)
Re: Storing many big files in database- should I do it?

No, I'm not storing RDBMS in S3. I didn't write that in my post.
S3 is used as CDN, only for downloading files.

So you are storing your files on S3?

Why should you store those files additionally in a PostgreSQL database?

If you want to keep track of them / remember metadata, hashes will do the
job with much less memory.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

#5Cédric Villemain
cedric.villemain.debian@gmail.com
In reply to: Joe Kramer (#1)
Re: Storing many big files in database- should I do it?

2010/4/27 Rod <cckramer@gmail.com>:

Hello,

I have a web application where users upload/share files.
After file is uploaded it is copied to S3 and all subsequent downloads
are done from there.
So in a file's lifetime it's accessed only twice- when created and
when copied to S3.

Files are documents, of different size from few kilobytes to 200
Megabytes. Number of files: thousands to hundreds of thousands.

My dilemma is - Should I store files in PGSQL database or store in
filesystem and keep only metadata in database?

I see the possible cons of using PGSQL as storage:
- more network bandwidth required comparing to access NFS-mounted filesystem ?
- if database becomes corrupt you can't recover individual files
- you can't backup live database unless you install complicated
replication add-ons
- more CPU required to store/retrieve files (comparing to filesystem access)
- size overhead, e.g. storing 1000 bytes will take 1000 bytes in
database + 100 bytes for db metadata, index, etc. with lot of files
this will be a lot of overhead.

Are these concerns valid?

yes

Anyone had this kind of design problem and how did you solve it?

store your files in a filesystem, and keep the path to the file (plus
metadata, acl, etc...) in database.

Thanks.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Cédric Villemain

#6Joe Kramer
cckramer@gmail.com
In reply to: Massa, Harald Armin (#4)
Re: Storing many big files in database- should I do it?

S3 is not the primary storage for the files; it's a distribution system.
We want to be able to switch from S3 to another CDN if required.
So the "master" copy of each file is kept on a private server. The question
is whether that should be the database or the filesystem.

On Tue, Apr 27, 2010 at 7:03 PM, Massa, Harald Armin <chef@ghum.de> wrote:

Why should you store those files additionally in a PostgreSQL database?

#7Adrian von Bidder
avbidder@fortytwo.ch
In reply to: Cédric Villemain (#5)
Re: Storing many big files in database- should I do it?

On Tuesday 27 April 2010 11.17:42 Cédric Villemain wrote:

Anyone had this kind of design problem and how did you solve it?

store your files in a filesystem, and keep the path to the file (plus
metadata, acl, etc...) in database.

... and be careful that the db and file storage do not go out of sync.

But if files are only ever added and possibly removed (but never changed),
this is not too hard:

* be sure to commit the db transaction only after the file has been written
to disk (use fsync or similar to be sure!) (for file deletions: first delete
the db metadata, then delete the file)
* be sure to detect failed writes and abort the db transaction, or otherwise
properly handle errors while storing the file
* occasionally run a clean-up to remove files that were written to the
filesystem but whose db metadata was not stored; this should be a rare case,
but it probably will happen
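
As a rough illustration of that ordering (file durable on disk before the
metadata commit), here is a minimal Python sketch; the `record_metadata`
callback is hypothetical and stands in for whatever INSERT-plus-COMMIT your
application performs:

```python
import os
import tempfile

def store_file(directory, name, data, record_metadata):
    """Write the file durably first; record db metadata only afterwards."""
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # make sure the bytes are on disk
        final = os.path.join(directory, name)
        os.rename(tmp, final)         # atomic within one filesystem
    except Exception:
        os.unlink(tmp)                # failed write: leave no orphan file
        raise                         # caller aborts the db transaction
    record_metadata(final)            # only now record/commit the metadata
    return final
```

A periodic clean-up job then only needs to look for files on disk that have
no matching metadata row.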

PostgreSQL supports 2PC (PREPARE TRANSACTION and then COMMIT PREPARED as
separate steps); you may want to use this (prepare the database transaction,
then do the filesystem operations; if the filesystem operation fails, you
can ROLLBACK the prepared transaction, otherwise COMMIT it). That way, you
don't lose transactional semantics.
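
Sketched as SQL (the table, column, and transaction names are made up for
illustration; note that max_prepared_transactions must be set above zero for
2PC to be available):

```sql
BEGIN;
INSERT INTO file_meta (id, path) VALUES (42, '4f/a2/example.bin');
PREPARE TRANSACTION 'file-42';   -- db work is durable but not yet visible

-- ... write the file to the filesystem and fsync it ...

COMMIT PREPARED 'file-42';       -- filesystem write succeeded
-- or, if the filesystem operation failed:
-- ROLLBACK PREPARED 'file-42';
```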

Backup requires some more thought. I guess you could use some kind of
volume management to get filesystem snapshots, but you have to be sure the
fs snapshot reflects the point in time when the database backup was made.
Depending on load/availability requirements, you may get away with stopping
data modification at the application level for a few seconds until the db
backup has started and the filesystem snapshot has been created.

cheers
-- vbi

--
featured product: PostgreSQL - http://postgresql.org

#8Anthony
osm@inbox.org
In reply to: Cédric Villemain (#5)
Re: Storing many big files in database- should I do it?

On Tue, Apr 27, 2010 at 5:17 AM, Cédric Villemain <
cedric.villemain.debian@gmail.com> wrote:

store your files in a filesystem, and keep the path to the file (plus
metadata, acl, etc...) in database.

What type of filesystem is good for this? A filesystem with support for
storing tens of thousands of files in a single directory, or should one play
the 41/56/34/41563489.ext game?

Are there any open source systems which handle keeping a filesystem and
database in sync for this purpose, or is it a wheel that keeps getting
reinvented?

I know "store your files in a filesystem" is the best long-term solution.
But it's just so much easier to throw everything in the database.

#9Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Anthony (#8)
Re: Storing many big files in database- should I do it?

On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:

Are there any open source systems which handle keeping a filesystem and
database in sync for this purpose, or is it a wheel that keeps getting
reinvented?

In the for-what-it's-worth department, check out this wiki:
http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems

--
Adrian Klaver
adrian.klaver@gmail.com

#10Cédric Villemain
cedric.villemain.debian@gmail.com
In reply to: Adrian Klaver (#9)
Re: Storing many big files in database- should I do it?

2010/4/28 Adrian Klaver <adrian.klaver@gmail.com>:

On Tuesday 27 April 2010 5:45:43 pm Anthony wrote:

What type of filesystem is good for this?  A filesystem with support for
storing tens of thousands of files in a single directory, or should one
play the 41/56/34/41563489.ext game?

I'd prefer to go with XFS or ext{3,4}, in both cases with a path game.
The path game will let you handle the scalability of your uploads (so
the first increment is the first directory), something like
1/2/3/4/foo.file, 2/2/3/4/bar.file, etc. You might explore a hash
function, or something that splits a SHA1 (or other) sum of the file to
get the path.
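
A minimal sketch of such a hash-based path scheme (the function name and
fan-out parameters are arbitrary choices for illustration):

```python
import hashlib
import os

def file_path(file_id, ext, levels=3, width=2):
    """Spread files across nested directories derived from a SHA1 of the id,
    e.g. '41/56/34/4156348f....ext', so no single directory grows too large."""
    digest = hashlib.sha1(file_id.encode()).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(*parts, digest + ext)

print(file_path("invoice-2010-04-27", ".pdf"))
```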

In the for what it is worth department check out this Wiki:
http://sourceforge.net/apps/mediawiki/fuse/index.php?title=DatabaseFileSystems

and postgres fuse also :-D


--
Cédric Villemain

#11David Wall
d.wall@computer.org
In reply to: Cédric Villemain (#10)
Re: Storing many big files in database- should I do it?

Things to consider when *not* storing them in the DB:

1) Backups of DB are incomplete without a corresponding backup of the files.

2) No transactional integrity between filesystem and DB, so you will
have to deal with orphans from both INSERT and DELETE (assuming you
don't also update the files).

3) No built in ability for replication, such as WAL shipping

The big downside for the DB is that all large objects appear to be stored
together in pg_catalog.pg_largeobject, which seems axiomatically
troubling: you know you have lots of big data, so you then store it all
together, and then worry about running out of 'loids'.

David


#12Justin Graf
justin@magwerks.com
In reply to: David Wall (#11)
Re: Storing many big files in database- should I do it?

On 4/29/2010 12:07 PM, David Wall wrote:

Big downside for the DB is that all large objects appear to be stored
together in pg_catalog.pg_largeobject, which seems axiomatically
troubling that you know you have lots of big data, so you then store
them together, and then worry about running out of 'loids'.

Huh??? Isn't that the point of using the bytea or text datatypes?

I could have sworn bytea does not use the large object interface; it uses
TOAST. Or have I gone insane?

Many people encode the binary data in Base64 and store it as the text data
type, and then never have to deal with escaping the bytea data type, which
I have found can be a pain.

All legitimate Magwerks Corporation quotations are sent in a .PDF file attachment with a unique ID number generated by our proprietary quotation system. Quotations received via any other form of communication will not be honored.

CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other information proprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it addresses. If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified that any unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this e-mail immediately.
Thank you.

#13Guillaume Lelarge
guillaume@lelarge.info
In reply to: Justin Graf (#12)
Re: Storing many big files in database- should I do it?

On 29/04/2010 18:45, Justin Graf wrote:

On 4/29/2010 12:07 PM, David Wall wrote:

Big downside for the DB is that all large objects appear to be stored
together in pg_catalog.pg_largeobject, which seems axiomatically
troubling that you know you have lots of big data, so you then store
them together, and then worry about running out of 'loids'.

Huh ??? isn't that point of using bytea or text datatypes.

I could have sworn bytea does not use large object interface it uses
TOAST or have i gone insane

You're not insane :)

Put it another way: bytea values are not stored in the pg_largeobject
catalog.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com

#14David Wall
d.wall@computer.org
In reply to: Guillaume Lelarge (#13)
Re: Storing many big files in database- should I do it?

Huh ??? isn't that point of using bytea or text datatypes.

I could have sworn bytea does not use large object interface it uses
TOAST or have i gone insane

You're not insane :)

Put it another way: bytea values are not stored in the pg_largeobject
catalog.

I missed the part that BYTEA was being used, since it's generally not a
good way to store large binary data: you are right that BYTEA
requires escaping across the wire (client to backend) in both directions,
which for true binary data (like compressed/encrypted data, images or
other non-text files) makes for a lot of expansion in size and related
memory use.

BYTEA and TEXT can both store up to 1 GB of data (the max field length),
which means even less "file size" supported if you use TEXT with base64
encoding. LO supports 2 GB of data. In JDBC, BYTEA is typically used with
byte[] or a binary stream, while LOs are used with BLOB. I think LOs allow
for streaming with the backend too, but I'm not sure about that, whereas
I'm pretty sure BYTEA/TEXT move all the data together, so it will be in
memory all or nothing.
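
That base64 inflation is easy to quantify; a quick Python check (payload
size chosen arbitrarily) shows the 4/3 growth, which is why the effective
file-size ceiling drops to roughly 750 MB when base64 text is used against
the 1 GB field limit:

```python
import base64

payload = b"\x00" * 300_000          # 300 kB of raw binary data
encoded = base64.b64encode(payload)  # base64 adds one third in size

print(len(payload), len(encoded))    # 300000 400000
ratio = len(encoded) / len(payload)  # 4/3
```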

Of course, to support file storage larger than 1 GB or 2 GB, you'll have to
create your own "toast"-like capability to split files across multiple rows.

David

#15Justin Graf
justin@magwerks.com
In reply to: David Wall (#14)
Re: Storing many big files in database- should I do it?

On 4/29/2010 1:51 PM, David Wall wrote:

BYTEA and TEXT both can store up to 1GB of data (max field length),
which means even less "file size" supported if you use TEXT with
base64 coding. LO supports 2GB of data.

Outside of videos/media streams, what other kind of data is going to be
1 gig in size? That's a lot of data, even today.

We all talk about the 1 gig and 2 gig limits on this, but really, who has
bumped into them on a regular basis? Every time I hear about them not
being big enough, the person is trying to shoehorn media files into
the database, which is insane.


#16Merlin Moncure
mmoncure@gmail.com
In reply to: David Wall (#14)
Re: Storing many big files in database- should I do it?

On Thu, Apr 29, 2010 at 1:51 PM, David Wall <d.wall@computer.org> wrote:

I missed the part that BYTEA was being used since it's generally not a good
way for starting large binary data because you are right that BYTEA requires
escaping across the wire (client to backend) both directions, which for true
binary data (like compressed/encrypted data, images or other non-text files)
makes for a lot of expansion in size and related memory.

What?? PostgreSQL supports binary data in both directions without
escaping. Here is how I do it with libpqtypes:

PGbytea b;
b.data = some_pointer;  /* pointer to the raw bytes, no escaping needed */
b.len = data_length;    /* number of bytes */

res = PQexecf(conn, "insert into mytable values (%bytea*);", &b);

merlin

#17Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Justin Graf (#12)
Re: Storing many big files in database- should I do it?

Justin Graf wrote:

On 4/29/2010 12:07 PM, David Wall wrote:

Big downside for the DB is that all large objects appear to be stored
together in pg_catalog.pg_largeobject, which seems axiomatically
troubling that you know you have lots of big data, so you then store
them together, and then worry about running out of 'loids'.

Huh ??? isn't that point of using bytea or text datatypes.

I could have sworn bytea does not use large object interface it uses
TOAST or have i gone insane

Each toasted object also requires an OID, so you cannot have more than 4
billion toasted attributes in a table.

I've never seen this to be a problem in real life, but if you're talking
about having that many large objects, then it will be a problem with
toast too.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#17)
Re: Storing many big files in database- should I do it?

Alvaro Herrera <alvherre@commandprompt.com> writes:

Each toasted object also requires an OID, so you cannot have more than 4
billion toasted attributes in a table.

I've never seen this to be a problem in real life, but if you're talking
about having that many large objects, then it will be a problem with
toast too.

However, that toast limit is per-table, whereas the pg_largeobject limit
is per-database. So for example if you have a partitioned table then
the toast limit only applies per partition. With large objects you'd
fall over at 4G objects (probably quite a bit less in practice) no
matter what.

regards, tom lane

#19Justin Graf
justin@magwerks.com
In reply to: Tom Lane (#18)
Re: Storing many big files in database- should I do it?

On 4/29/2010 3:18 PM, Tom Lane wrote:

However, that toast limit is per-table, whereas the pg_largeobject limit
is per-database. So for example if you have a partitioned table then
the toast limit only applies per partition. With large objects you'd
fall over at 4G objects (probably quite a bit less in practice) no
matter what.


Has there been any thought of doing something similar to MS SQL Server's
FILESTREAM?
http://msdn.microsoft.com/en-us/library/cc949109.aspx

It seems to overcome all the drawbacks of storing files in the DB.


#20Scott Ribe
scott_ribe@killerbytes.com
In reply to: Justin Graf (#12)
Re: Storing many big files in database- should I do it?

On Apr 29, 2010, at 10:45 AM, Justin Graf wrote:

Many people encode the binary data in Base64 and store as text data
type?? Then never have to deal with escaping bytea data type. Which i
have found can be a pain

Damn. Wish I'd thought of that ;-)

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice