storing large amounts of text

Started by Paul Dlugalmost 26 years ago7 messagesgeneral
Jump to latest
#1Paul Dlug
paul@nerdlabs.com

I'm sure this has come up before because it just logically seems like
others would have run up against this too. Is it possible to store a
large amount of text in a row/field? I want to store articles in the
database along with information about them, sort of a content databasing
system. Can anyone reccommend a good way to go about doing this in
Postgres? Any and all help greatly appreciated.

--Paul

#2Paul Dlug
paul@nerdlabs.com
In reply to: Paul Dlug (#1)
Re: storing large amounts of text

I thought TEXT had a limit of 8K or something around that size, I would
need something larger.

Stephan Richter wrote:

Show quoted text

At 08:19 PM 4/22/00 +0000, Paul Dlug wrote:

I'm sure this has come up before because it just logically seems like
others would have run up against this too. Is it possible to store a
large amount of text in a row/field? I want to store articles in the
database along with information about them, sort of a content databasing
system. Can anyone reccommend a good way to go about doing this in
Postgres? Any and all help greatly appreciated.

The TEXT type will help you or do you look for something else?

CREATE TABLE foo (
id int8,
content text);

Regards,
Stephan
--
Stephan Richter - (901) 573-3308 - srichter@cbu.edu
CBU - Physics & Chemistry; Framework Web - Web Design & Development
PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391

#3Stephan Richter
srichter@cbu.edu
In reply to: Paul Dlug (#1)
Re: storing large amounts of text

At 08:19 PM 4/22/00 +0000, Paul Dlug wrote:

I'm sure this has come up before because it just logically seems like
others would have run up against this too. Is it possible to store a
large amount of text in a row/field? I want to store articles in the
database along with information about them, sort of a content databasing
system. Can anyone reccommend a good way to go about doing this in
Postgres? Any and all help greatly appreciated.

The TEXT type will help you or do you look for something else?

CREATE TABLE foo (
id int8,
content text);

Regards,
Stephan
--
Stephan Richter - (901) 573-3308 - srichter@cbu.edu
CBU - Physics & Chemistry; Framework Web - Web Design & Development
PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391

#4Titus Brown
titus@caltech.edu
In reply to: Paul Dlug (#1)
Re: storing large amounts of text

-> I'm sure this has come up before because it just logically seems like
-> others would have run up against this too. Is it possible to store a
-> large amount of text in a row/field? I want to store articles in the
-> database along with information about them, sort of a content databasing
-> system. Can anyone reccommend a good way to go about doing this in
-> Postgres? Any and all help greatly appreciated.

I'm doing this for a genome project; you need to look into large object
functionality if you want to store anything larger than 8kb in PostgreSQL.

This is pretty inconvenient to do, because it's slow to search (I had
to write my own search routine; it's at

http://www.idyll.org/~t/www-tools/

along with some other stuff for AOLserver/large objects) and you have
to use a special program to dump the large objects (again, contact me,
or refer to the archives).

Several people have recommended that you store the files themselves on
your file system and have just the indices be kept in the database. That
should work as well, and I'll actually be looking into it today.

cheers,
--titus

#5Brett W. McCoy
bmccoy@chapelperilous.net
In reply to: Paul Dlug (#2)
Re: storing large amounts of text

On Sat, 22 Apr 2000, Paul Dlug wrote:

I thought TEXT had a limit of 8K or something around that size, I would
need something larger.

You'd be better off breaking the text up into smaller fields. Another
option is to store a pathname to a file that contains the text of your
article.

Brett W. McCoy
http://www.chapelperilous.net
---------------------------------------------------------------------------
QOTD:
"He's on the same bus, but he's sure as hell got a different
ticket."

#6Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Titus Brown (#4)
Re: storing large amounts of text

Just to give everyone here a head's up as to what's coming: The developers
have long known that the size limit on tuple storage (particularly as it
affects the 'text' type) needs to go away, and that the existing large
objects (lo) are less than ideal as a work around. Jan Wieck has done
the inital plan and development of a generic replacement: the Oversized
Tuple Storage Technique (TOAST).

http://www.postgresql.org/projects/devel-toast.html

This system allows automatic compression and/or external storage of
any datatype, directly in the database. It's still in development, but
Jan has recently released a snapshot, so it's past the initial planning
stages. For now, we'll all have to make do with lo or external file store,
but there is hope on the horizon...

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St., Houston, TX 77005

Show quoted text

On Sat, Apr 22, 2000 at 03:32:32PM -0700, Titus Brown wrote:

-> I'm sure this has come up before because it just logically seems like
-> others would have run up against this too. Is it possible to store a
-> large amount of text in a row/field? I want to store articles in the
-> database along with information about them, sort of a content databasing
-> system. Can anyone reccommend a good way to go about doing this in
-> Postgres? Any and all help greatly appreciated.

I'm doing this for a genome project; you need to look into large object
functionality if you want to store anything larger than 8kb in PostgreSQL.

This is pretty inconvenient to do, because it's slow to search (I had
to write my own search routine; it's at

http://www.idyll.org/~t/www-tools/

along with some other stuff for AOLserver/large objects) and you have
to use a special program to dump the large objects (again, contact me,
or refer to the archives).

Several people have recommended that you store the files themselves on
your file system and have just the indices be kept in the database. That
should work as well, and I'll actually be looking into it today.

cheers,
--titus

#7Jan Wieck
JanWieck@Yahoo.com
In reply to: Ross J. Reedstrom (#6)
Re: storing large amounts of text

Ross J. Reedstrom wrote:

Just to give everyone here a head's up as to what's coming: The developers
have long known that the size limit on tuple storage (particularly as it
affects the 'text' type) needs to go away, and that the existing large
objects (lo) are less than ideal as a work around. Jan Wieck has done
the inital plan and development of a generic replacement: the Oversized
Tuple Storage Technique (TOAST).

http://www.postgresql.org/projects/devel-toast.html

Look at

http://www.postgresql.org/~wieck

for a just uploaded snapshot of the ongoing TOAST
development. It demonstrates with a user defined CLOB type
what the toaster actually can do to store a multiple MB sized
string into a single attribute (of course, one table could
contain several CLOB columns and they get toasted as needed -
biggest first).

This system allows automatic compression and/or external storage of
any datatype, directly in the database. It's still in development, but
Jan has recently released a snapshot, so it's past the initial planning
stages. For now, we'll all have to make do with lo or external file store,
but there is hope on the horizon...

AFAIK, we'll finally have a CLOB data type upward compatible
with what's in the above snapshot. So someone in really
urgent need to get going with it please feel free to contact
me directly. The above is not meant to be used in production.
But we should be able to create BLOB in the same way too and
create a useful after 7.0 feature patch in a few weeks.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck@debis.com (Jan Wieck) #