random access - bytea

Started by Dennis Bjorklundabout 22 years ago7 messages

db@zigo.dhs.org

about 22 years ago

There have been (in the past at least) plans on adding a random access
interface to bytea fields. I don't find it in the todo, maybe it should be
added?

What kind of syntax have people thought about, to let the client read a
bytea field in a random access way? I'm thinking of something like:

BEGIN;

SELECT id, open_bytea(bytea_field, READ_ACCESS) FROM table;

... and then the client has to access the data in the same transaction,
using the ID returned by open_bytea() ...

COMMIT;

Would that be feasible? I think it looks like a clean solution. To me
it seems that accessing the data outside a transaction would not be
something we want anyway, right?

Sematically it's easier to implement read access then write access. For
write access to work with transactions one need for example the blocks in
a bytea field to be stored with the transaction id's an stuff (I don't
know these parts of pg internals well enough yet).

One other thing I don't know well enough yet is the new wire protocol. To
support something like above, would it need to be changed? I guess so if
it wasn't designed with this in mind.

--
/Dennis

Joe Conway

mail@joeconway.com

about 22 years ago

In reply to: Dennis Bjorklund (#1)

Re: random access - bytea

Dennis Bjorklund wrote:

There have been (in the past at least) plans on adding a random access
interface to bytea fields. I don't find it in the todo, maybe it should be
added?

First do this:
ALTER TABLE foo ALTER COLUMN foo_bytea SET STORAGE EXTERNAL;

see:
http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=sql-altertable.html

Then do:
UPDATE foo SET foo_bytea = foo_bytea || '';

That will modify the bytea column so that it is stored uncompressed in
the TOAST table.

Now, simply use substr() to grab any random chunk:
SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;

When TOASTed columns are stored uncompressed, substring will grab just
the needed chunks from the TOAST table directly. In contrast, when the
field is stored compressed, it grabs the entire thing, uncompresses it,
then gets the piece you asked for.

HTH,

Joe

Dennis Bjorklund

db@zigo.dhs.org

about 22 years ago

In reply to: Joe Conway (#2)

Re: random access - bytea

On Sat, 25 Oct 2003, Joe Conway wrote:

That will modify the bytea column so that it is stored uncompressed in
the TOAST table.

Now, simply use substr() to grab any random chunk:
SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;

This would imply that every little read would have to do a scan on a table
to find the row and then to perform the substr. An open command can
be optimized a lot more, for example to cache entries that have been
opened so that it's fast to read the next 1kb or whatever you want.

Also, the above does not solve writes at all which can also be made
transaction safe and fast with a better api where you can update a part
of a field.

The above is not really a replacement of the current lo-objects.

--
/Dennis

Christopher Kings-Lynne

chriskl@familyhealth.com.au

about 22 years ago

In reply to: Dennis Bjorklund (#3)

Re: random access - bytea

This would imply that every little read would have to do a scan on a table
to find the row and then to perform the substr. An open command can
be optimized a lot more, for example to cache entries that have been
opened so that it's fast to read the next 1kb or whatever you want.

It's an index scan, so it's pretty fast...

Chris

Dennis Bjorklund

db@zigo.dhs.org

about 22 years ago

In reply to: Christopher Kings-Lynne (#4)

Re: random access - bytea

On Sun, 26 Oct 2003, Christopher Kings-Lynne wrote:

to find the row and then to perform the substr. An open command can
be optimized a lot more, for example to cache entries that have been
opened so that it's fast to read the next 1kb or whatever you want.

It's an index scan, so it's pretty fast...

And if you read a lot it will be cached, but it's still a little extra
overhead and the write case it does not solve at all.

--
/Dennis

Hannu Krosing

hannu@tm.ee

about 22 years ago

In reply to: Dennis Bjorklund (#3)

Re: random access - bytea

Dennis Bjorklund kirjutas P, 26.10.2003 kell 07:30:

On Sat, 25 Oct 2003, Joe Conway wrote:

That will modify the bytea column so that it is stored uncompressed in
the TOAST table.

Now, simply use substr() to grab any random chunk:
SELECT substr(foo_bytea, 20000, 100) from foo where foo_id = 42;

This would imply that every little read would have to do a scan on a table
to find the row and then to perform the substr. An open command can
be optimized a lot more, for example to cache entries that have been
opened so that it's fast to read the next 1kb or whatever you want.

Also, the above does not solve writes at all which can also be made
transaction safe and fast with a better api where you can update a part
of a field.

I brought it up once and Tom answered that TOAST tables are not
transaction aware, so you can't update just some parts of toasted
entities - you must always write the whole fields. So this will not be
just an api change.

The above is not really a replacement of the current lo-objects.

True.

-------------
Hannu

Dennis Bjorklund

db@zigo.dhs.org

about 22 years ago

In reply to: Hannu Krosing (#6)

Re: random access - bytea

On Sun, 26 Oct 2003, Hannu Krosing wrote:

I brought it up once and Tom answered that TOAST tables are not
transaction aware, so you can't update just some parts of toasted
entities - you must always write the whole fields. So this will not be
just an api change.

Yes, the blocks (or what one selects to work with) in a toasted field have
to have transaction stuff stored, just like tuples. That is clear if
different transactions shall have different views of the "file".

--
/Dennis