Using BLOBs with PostgreSQL

Started by Tim Kientzle · over 25 years ago · 6 messages
#1 Tim Kientzle
kientzle@acm.org

I'm evaluating a couple of different databases for use as
the back-end to a web-based publishing system that's currently
being developed in Java and Perl.

I want to keep _all_ of the data in the database, to
simplify future replication and data management. That
includes such data as GIF images, large HTML files,
even multi-megabyte downloadable software archives.

I've been using MySQL for initial development; it has pretty
clean and easy-to-use BLOB support. You just declare a BLOB
column type, then read and write arbitrarily large chunks of data.
In Perl, BLOB columns work just like varchar columns; in JDBC,
the getBinaryStream()/setBinaryStream() functions provide support
for streaming large data objects.
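
For reference, a rough sketch of the MySQL side of this, with a
hypothetical table name:

    -- Hypothetical MySQL table: one BLOB column holds arbitrary binary
    -- data (GIF images, HTML files, downloadable software archives).
    CREATE TABLE assets (
        id   INTEGER NOT NULL PRIMARY KEY,
        data LONGBLOB  -- MySQL BLOB-family type; read/written as raw bytes
    );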

How well-supported is this functionality in PostgreSQL?
I did some early experimenting with PG, but couldn't
find any column type that would accept binary data
(apparently PG's parser chokes on null characters?).

I've heard about TOAST, but have no idea what it really
is, how to use it, or how well it performs. I'm leery
of database-specific APIs.

- Tim Kientzle

#2 Martin A. Marques
martin@math.unl.edu.ar
In reply to: Tim Kientzle (#1)
Re: Using BLOBs with PostgreSQL

On Sat, 07 Oct 2000, Tim Kientzle wrote:

> I've been using MySQL for initial development; it has pretty
> clean and easy-to-use BLOB support. You just declare a BLOB
> column type, then read and write arbitrarily large chunks of data.
> In Perl, BLOB columns work just like varchar columns; in JDBC,
> the getBinaryStream()/setBinaryStream() functions provide support
> for streaming large data objects.

If you're talking about text BLOBs, just declare the column as text and that's
all. In the case of binary data I have no idea; I only work with text
data.
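
For instance, a minimal sketch (hypothetical table name):

    -- Large textual content goes in an ordinary text column.
    CREATE TABLE docs (
        id   integer,
        body text
    );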

> How well-supported is this functionality in PostgreSQL?
> I did some early experimenting with PG, but couldn't
> find any column type that would accept binary data
> (apparently PG's parser chokes on null characters?).
>
> I've heard about TOAST, but have no idea what it really
> is, how to use it, or how well it performs. I'm leery
> of database-specific APIs.

As far as I've heard, it looks like a nice way to optimize searches in
BLOBs. I don't know anything else about it.

Regards... :-)

--
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-----------------------------------------------------------------
Martín Marqués email: martin@math.unl.edu.ar
Santa Fe - Argentina http://math.unl.edu.ar/~martin/
System administrator at math.unl.edu.ar
-----------------------------------------------------------------

#3 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Martin A. Marques (#2)
Re: [GENERAL] Using BLOBs with PostgreSQL

> On Sat, 07 Oct 2000, Tim Kientzle wrote:
>
>> I've been using MySQL for initial development; it has pretty
>> clean and easy-to-use BLOB support. You just declare a BLOB
>> column type, then read and write arbitrarily large chunks of data.
>> In Perl, BLOB columns work just like varchar columns; in JDBC,
>> the getBinaryStream()/setBinaryStream() functions provide support
>> for streaming large data objects.
>
> If you're talking about text BLOBs, just declare the column as text and that's
> all. In the case of binary data I have no idea; I only work with text
> data.

[General CC removed, hackers CC added.]

Yes, this raises a good question. We have long text fields, and bytea for
binary input with escapes. As far as I know, we can store binary values
in text fields, but we have no way of getting them in there via SQL
queries (except for bytea, using escapes for the binary values).

Should we look at extending some type of large-object-style API to
get binary values into these fields? Do we create a BLOB type that can
take a file name, like large objects? It seems we shouldn't use TEXT for
binary data if we can help it, because it is confusing, and a BLOB type
would allow a SELECT on a BLOB column to return nothing rather than
blast binary data into the user's terminal.

When we had the 8k row limit, no one really wanted to store binary
data in there, because most binary values are >8k. But now that we can
store them, should we provide a way for users to get them into TOASTed
values?

I have on the TODO list already:

* Add non-large-object binary field

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4 Tim Kientzle
kientzle@acm.org
In reply to: Bruce Momjian (#3)
Re: [GENERAL] Using BLOBs with PostgreSQL

Bruce Momjian wrote:

>> On Sat, 07 Oct 2000, Tim Kientzle wrote:
>>
>>> I've been using MySQL for initial development; it has pretty
>>> clean and easy-to-use BLOB support. You just declare a BLOB
>>> column type, then read and write arbitrarily large chunks of data.
>>
>> If you're talking about text BLOBs, just declare the column as text and that's
>> all. In the case of binary data I have no idea; I only work with text
>> data.
>
> Yes, this raises a good question. We have long text fields, and bytea for
> binary input with escapes. As far as I know, we can store binary values
> in text fields, but we have no way of getting them in there via SQL
> queries (except for bytea, using escapes for the binary values).

"bytea for binary data with escapes"??

What would those escapes look like? I could not find any escape
that would put a NULL character into a bytea field.

- Tim Kientzle

#5 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tim Kientzle (#4)
Re: [GENERAL] Using BLOBs with PostgreSQL

>> Yes, this raises a good question. We have long text fields, and bytea for
>> binary input with escapes. As far as I know, we can store binary values
>> in text fields, but we have no way of getting them in there via SQL
>> queries (except for bytea, using escapes for the binary values).
>
> "bytea for binary data with escapes"??
>
> What would those escapes look like? I could not find any escape
> that would put a NULL character into a bytea field.

'a\\0b' puts a, NULL, b. You need a double backslash to get one backslash
to the backend.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: Re: [GENERAL] Using BLOBs with PostgreSQL

Bruce Momjian <pgman@candle.pha.pa.us> writes:

"bytea for binary data with escapes"??

What would those escapes look like? I could not find any escape
that would put a NULL character into a bytea field.

'a\\0b' put a, NULL, b. You need double-backslash to get one backslash
to the backend.

Looking at byteain, it's clear that you must write 3 octal digits,
so the correct incantation is
a\000b
in COPY input data, or
a\\000b
in a quoted literal in an SQL INSERT statement (since the parser will
eat one level of backslash before the type-specific converter gets it).
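
Concretely, with a hypothetical table:

    -- bytea column to hold the binary value
    CREATE TABLE blobs (id integer, data bytea);

    -- The parser reduces \\000 to \000, which byteain then turns
    -- into a single NUL byte between 'a' and 'b':
    INSERT INTO blobs VALUES (1, 'a\\000b');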

This strikes me as not too unreasonable for ASCII conversion of bytea.

I believe that it should be possible to retrieve an unconverted form
of a bytea value via a BINARY CURSOR, though I haven't tried it myself.
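
Something along these lines, equally untested (hypothetical table name
again):

    -- Cursors only work inside a transaction block; BINARY asks the
    -- backend to return values unconverted instead of ASCII-escaped.
    BEGIN;
    DECLARE c BINARY CURSOR FOR SELECT data FROM blobs;
    FETCH ALL FROM c;
    END;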

You could also do I/O without conversion using COPY BINARY, but only if
you are superuser, and even then the surrounding file syntax is weird
and undocumented. COPY BINARY across stdin/stdout (which'd be needed
for non-superuser usage) doesn't work.

I don't think there's anything particularly wrong with toasted bytea
as a storage mechanism for non-textual BLOBs, but we'd certainly need
to do something about the COPY situation to make storing random binary
data in a bytea anything except an exercise in masochism. Also it'd
be nice if bytea wasn't quite so impoverished operator-wise.

regards, tom lane