TOAST and TEXT

Started by Chris Bitmead over 24 years ago · 12 messages

#1 Chris Bitmead
chris@bitmead.com

Hi,

Now that postgresql doesn't have field size limits, it seems to
me they should be good for storing large blobs, even if it means
having to uuencode them to be non-binary or whatever. I don't
like the old large object implementation, I need to store very large
numbers of objects and unless this implementation has changed
in recent times it won't cut it.

So my question is, I assume TEXT is the best data type to store
large things in, what precisely is the range of characters that
I can store in TEXT? Is it only characters ascii <= 127, or is
it only printable characters, or everything except '\0' or what?

#2 Rod Taylor
rbt@barchord.com
In reply to: Chris Bitmead (#1)
Re: TOAST and TEXT

It should be noted that there is still a limit of about 1GB if I
remember correctly.
--
Rod Taylor

There are always four sides to every story: your side, their side, the
truth, and what really happened.

#3 Marko Kreen
marko@l-t.ee
In reply to: Chris Bitmead (#1)
Re: TOAST and TEXT

On Wed, Oct 10, 2001 at 11:33:04AM +1000, Chris Bitmead wrote:

So my question is, I assume TEXT is the best data type to store
large things in, what precisely is the range of characters that
I can store in TEXT? Is it only characters ascii <= 127, or is
it only printable characters, or everything except '\0' or what?

text accepts everything except \0, and various functions also take
locale/charset info into account.

Use bytea instead; it's for binary data (bytes 0-255). If your client
library does not support it, base64-encode it on the client side and
decode() it into place later.
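
For example, a rough client-side sketch in Perl -- just an
illustration, the table/column names are made up, and it assumes
DBI/DBD::Pg and MIME::Base64 are available:

use DBI;
use MIME::Base64;

my $dbh  = DBI->connect("dbi:Pg:dbname=test", "", "", { RaiseError => 1 });
my $blob = "arbitrary bytes, \0 included";   # any binary data
my $b64  = encode_base64($blob, "");         # base64, no line breaks
# base64 output contains only [A-Za-z0-9+/=], so inlining it in SQL is safe
$dbh->do("INSERT INTO mails (body) VALUES (decode('$b64', 'base64'))");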

--
marko

#4 Jan Wieck
JanWieck@Yahoo.com
In reply to: Rod Taylor (#2)
Re: TOAST and TEXT

Rod Taylor wrote:

It should be noted that there is still a limit of about 1GB if I
remember correctly.

You're right, there is still a practical limit on the size of
a text field. And it's usually much lower than 1GB.

The problem is that first, the (encoded) data has to be put
completely into the query string, passed to the backend and
buffered there entirely in memory. Then it gets parsed, and
the data copied into a const node. After rewriting and
planning, a heap tuple is built, containing the third
(possibly fourth) in-memory copy of the data. After that, the
toaster kicks in, allocates another chunk of that size to try
to compress the data, and finally slices it up for storage.

So the limit depends on how much swap space you have and where
your OS's per-process virtual memory limit is.

In practice, sizes of up to 10 MB are no problem. So storing
typical MP3s works.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chris Bitmead (#1)
Re: TOAST and TEXT

Chris Bitmead <chris@bitmead.com> writes:

... I don't
like the old large object implementation, I need to store very large
numbers of objects and unless this implementation has changed
in recent times it won't cut it.

Have you looked at 7.1? AFAIK it has no particular problem with
lots of LOs.

Which is not to discourage you from going over to bytea fields instead,
if that model happens to be more convenient for your application.
But your premise above seems false.

regards, tom lane

#6 Chris Bitmead
chris@bitmead.com
In reply to: Tom Lane (#5)
Re: TOAST and TEXT

Chris Bitmead <chris@bitmead.com> writes:

... I don't
like the old large object implementation, I need to store very large
numbers of objects and unless this implementation has changed
in recent times it won't cut it.

Have you looked at 7.1? AFAIK it has no particular problem with
lots of LOs.

Which is not to discourage you from going over to bytea fields instead,
if that model happens to be more convenient for your application.
But your premise above seems false.

I'm storing emails, which as we know are usually small but occasionally
huge. OK, I see in the release notes something like "store all large
objects in one table" and "pg_dump of large objects". That sounds like
maybe LOs are now OK, although for portability with Oracle blobs it
would be nice if they could be embedded in any row, or at least appear
to be so from the client interface side (Java client for what I'm doing).

BTW, the postgres docs web pages say there is "no limitation" on row
size. Someone should probably update that with the info given in the
last few emails, and probably integrate it into the regular doco as well.

#7 Balaji Venkatesan
balaji.venkatesan@megasoft.com
In reply to: Chris Bitmead (#6)
Suitable Driver ?

Hi,
I have to set up Perl to interact with PostgreSQL.
I have taken the following steps:

1. Installation of Perl 5.6.0 under Red Hat Linux 7.0
2. Installation of PostgreSQL under Red Hat Linux 7.0

Both are working perfectly as separate modules.

Now I need to interface Perl with PostgreSQL.

I need to know what's the best possible solution.

I have installed the latest DBI from www.cpan.org.

Now I need to install DBD for PostgreSQL. Is this the driver I have
to work with for PostgreSQL? Or do I have any other option to connect
to PostgreSQL from Perl? I have also found another way: using the Pg
driver provided by PostgreSQL to interface Perl with PostgreSQL.

I need to know exactly the difference between "use Pg;" and "use DBI;",
and which one is the right direction under what circumstances.

When I tried to install DBD-Pg-0.93.tar.gz under Linux I get:

Configuring Pg
Remember to actually read the README file !
please set environment variables POSTGRES_INCLUDE and POSTGRES_LIB !

I need to know what these variables POSTGRES_INCLUDE and POSTGRES_LIB
should point to ...

And when I tried to run perl test.pl, the program that comes with the
tarball to test the installation of the module, I get the error:

OS: linux
install_driver(Pg) failed: Can't locate DBD/Pg.pm in @INC (@INC contains:
/usr/lib/perl5/5.6.0/i386-linux /usr/lib/perl5/5.6.0
/usr/lib/perl5/site_perl/5.6.0/i386-linux /usr/lib/perl5/site_perl/5.6.0
/usr/lib/perl5/site_perl .) at (eval 1) line 3.
Perhaps the DBD::Pg perl module hasn't been fully installed,
or perhaps the capitalisation of 'Pg' isn't right.
Available drivers: ADO, ExampleP, Multiplex, Proxy.
at test.pl line 51

Anybody who can clarify is most welcome.

with regards,
Prassanna...

#8 Alex Pilosov
alex@pilosoft.com
In reply to: Balaji Venkatesan (#7)
Re: Suitable Driver ?

On Thu, 11 Oct 2001, Balaji Venkatesan wrote:

Now I need to install DBD for PostgreSQL. Is this the driver I have
to work with for PostgreSQL? Or do I have any other option to connect
to PostgreSQL from Perl? I have also found another way: using the Pg
driver provided by PostgreSQL to interface Perl with PostgreSQL.

You need DBD::Pg, which is a DBD driver for postgres.

I need to know exactly the difference between "use Pg;" and "use DBI;",
and which one is the right direction under what circumstances.

You need "use DBI;" and "use DBD::Pg;".
Pg by itself is a slightly lower-level module that is similar to the C
interface to PostgreSQL.
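
A minimal DBI session then looks something like this (just a sketch --
the database name and credentials below are placeholders):

use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=test", "user", "password",
                       { RaiseError => 1 });
# fetch one row, one column: the server version string
my ($version) = $dbh->selectrow_array("SELECT version()");
print "$version\n";
$dbh->disconnect;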

When I tried to install DBD-Pg-0.93.tar.gz under Linux I get:

Configuring Pg
Remember to actually read the README file !
please set environment variables POSTGRES_INCLUDE and POSTGRES_LIB !

I need to know what these variables POSTGRES_INCLUDE and POSTGRES_LIB
should point to ...

To the location of your installed Postgres include files and libraries.
For example:

export POSTGRES_INCLUDE=/usr/local/pgsql/include
export POSTGRES_LIB=/usr/local/pgsql/lib

-alex

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Balaji Venkatesan (#7)
Re: Suitable Driver ?

"Balaji Venkatesan" <balaji.venkatesan@megasoft.com> writes:

I have installed the latest DBI from www.cpan.org.
Now I need to install DBD for PostgreSQL. Is this the driver I have
to work with for PostgreSQL?

If you want to use DBI then you should get the DBD::Pg driver from
CPAN. (Yes, it is on CPAN, even though their index page about DBD
modules didn't list it last time I looked.)
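
For example, with POSTGRES_INCLUDE and POSTGRES_LIB set as Alex
described, something like this should fetch and build it (assuming
your CPAN shell is already configured):

perl -MCPAN -e 'install DBD::Pg'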

I need to know exactly the difference between
"use Pg;" and "use DBI;" ...

Pg is an older stand-alone driver; it's not DBI-compatible,
and it's got nothing to do with DBD::Pg.

regards, tom lane

#10 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Chris Bitmead (#6)
Re: TOAST and TEXT

BTW, the postgres docs web pages says there is "no limitation" on row
size. Someone should probably update that with the info given in the
last few emails and probably integrate it in the regular doco as well.

Although the field length is limited to 1GB, is there a row size limit?
I don't know of one. The FAQ does say below the list:

Of course, these are not actually unlimited, but limited to
available disk space and memory/swap space. Performance may suffer
when these values get unusually large.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#10)
Re: TOAST and TEXT

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Although the field length is limited to 1GB, is there a row size
limit?

Sure: 1GB per field (hard limit) times 1600 fields (also a hard limit).
In practice less, since TOAST pointers are 20 bytes each at present,
meaning you can't have more than BLCKSZ/20 toasted fields in one row.

Whether this has anything to do with real applications is debatable,
however. I find it hard to visualize a table design that needs several
hundred columns that *all* need to be GB-sized.

regards, tom lane

#12 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#11)
Re: TOAST and TEXT

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Although the field length is limited to 1GB, is there a row size
limit?

Sure: 1GB per field (hard limit) times 1600 fields (also a hard limit).
In practice less, since TOAST pointers are 20 bytes each at present,
meaning you can't have more than BLCKSZ/20 toasted fields in one row.

I read this as 8192/20 = ~409 toasted fields per row, so roughly 409GB with 8k pages.

Whether this has anything to do with real applications is debatable,
however. I find it hard to visualize a table design that needs several
hundred columns that *all* need to be GB-sized.

Yes, that just makes my head hurt. Easier to just say "unlimited", i.e.
limited only by your computer's memory/disk.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026