JDBC adaptor issue
Hi all,
I've been trying to get PostgreSQL to work with Apple's
WebObjects application server. WebObjects uses JDBC as an
interface to back-end databases, translating between SQL and a
pure object model.
I had a problem with incorrect SQL being generated and sent to
the PostgreSQL back end. After some work, I tracked it down. I
have a fix, but the fix has ramifications for the way that
others use PostgreSQL, so I decided to post here and see what
people think.
It turns out that WebObjects uses the
PreparedStatement.setCharacterStream method in order to set the
values of some character parameters in prepared statements, and
thus the generated SQL. It's not at all clear why it does this
for some parameters but not others; the reason doesn't seem to
have anything to do with the declared length of the parameters.
This seems odd, because setCharacterStream is a very
high-overhead operation, but in any case, that's what it does.
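For anyone unfamiliar with the call, this is roughly what that kind of binding looks like from the application side (the connection details, table, and column names here are made up for illustration):

import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CharacterStreamBinding {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        // Hypothetical connection details, for illustration only.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");

        String value = "an ordinary string value";

        // WebObjects-style binding: the parameter is supplied as a character
        // stream rather than with setString(), even though the target column
        // is a plain char/varchar.
        PreparedStatement ps = conn.prepareStatement(
                "SELECT * FROM some_table WHERE some_text_field = ?");
        ps.setCharacterStream(1, new StringReader(value), value.length());

        ResultSet rs = ps.executeQuery();
        while (rs.next()) {
            System.out.println(rs.getString("some_text_field"));
        }
        rs.close();
        ps.close();
        conn.close();
    }
}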
The PostgreSQL JDBC driver, however, makes the assumption that
any JDBC client class that's using the set/get...stream methods
wants to exchange information with a field that's been
explicitly typed as a BLOB. It therefore does what PostgreSQL
requires: it creates a new object containing the data, then uses
the object ID of the new object as the value to stuff into the
query. This has the effect of generating queries like
SELECT ...
WHERE some_text_field = 57909 ...
57909 is an object ID. The comparison doesn't work because
some_text_field is an ordinary char or varchar, not a BLOB.
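To make that concrete, here is a sketch of the path the driver takes today, written from the application's point of view using the driver's documented large-object API. The method names are those of the current org.postgresql.largeobject interface (the 7.1-era driver exposed the same facility through slightly different accessors), so take this as schematic rather than the driver's actual code:

import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.PGConnection;
import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class LargeObjectPathSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");
        conn.setAutoCommit(false);   // large objects must be used inside a transaction

        // Roughly what the stream methods do now: create a new large object,
        // write the parameter data into it, and remember its OID.
        LargeObjectManager lom = conn.unwrap(PGConnection.class).getLargeObjectAPI();
        long oid = lom.createLO(LargeObjectManager.READWRITE);
        LargeObject lo = lom.open(oid, LargeObjectManager.WRITE);
        lo.write("the streamed parameter text".getBytes("UTF-8"));
        lo.close();

        // The OID (e.g. 57909) is what ends up in the generated SQL, so the
        // comparison against an ordinary char/varchar column can never match:
        //   SELECT ... WHERE some_text_field = 57909
        System.out.println("parameter was stored as large object " + oid);

        conn.commit();
        conn.close();
    }
}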
It's kind of hard to figure out the "right" solution to this
problem. I've patched the PostgreSQL JDBC implementation of
PreparedStatement.setCharacterStream to treat any stream smaller
than 8190 bytes as a string. I chose 8190 because of the old
limit of 8192 bytes per tuple in versions prior to 7.1: before
then, an ordinary character field could never have held that
much data, so any stream that large must have been meant for a
BLOB. That makes the change least likely to cause compatibility
problems for systems using setCharacterStream the way the
PostgreSQL developers anticipated. I can provide the patch to
anyone who needs it.
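The patch amounts to a length check inside the driver's setCharacterStream. A minimal sketch of the idea follows; the class and helper method names are hypothetical stand-ins for the driver's internal binding paths:

import java.io.IOException;
import java.io.Reader;
import java.sql.SQLException;

public class SetCharacterStreamPatchSketch {

    private static final int OLD_TUPLE_LIMIT = 8190;

    public void setCharacterStream(int parameterIndex, Reader reader, int length)
            throws SQLException {
        if (length < OLD_TUPLE_LIMIT) {
            // Small streams are read into memory and bound as an ordinary
            // quoted string, exactly as setString() would do.
            char[] buf = new char[length];
            int off = 0;
            try {
                while (off < length) {
                    int n = reader.read(buf, off, length - off);
                    if (n < 0) {
                        break;
                    }
                    off += n;
                }
            } catch (IOException e) {
                throw new SQLException("could not read parameter stream: " + e);
            }
            bindAsString(parameterIndex, new String(buf, 0, off));
        } else {
            // Anything larger keeps the old behavior: store the data as a
            // large object and bind its OID.
            bindAsLargeObject(parameterIndex, reader, length);
        }
    }

    // Placeholders for the driver's existing binding paths.
    private void bindAsString(int i, String s) throws SQLException { /* ... */ }
    private void bindAsLargeObject(int i, Reader r, int len) throws SQLException { /* ... */ }
}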
The WebObjects use of JDBC is in line with the JDBC 2.0
specification; that spec does not place any restrictions on the
types of fields that can be accessed via get/set...stream.
Whether it's a good use is a different question, of course, but
it's still legal. My little kludge with an 8190-byte "switch" to
the old behavior really can't be the last word.
I was hoping that someone could look at the PostgreSQL back end
to see if there's any reason to keep the 8190-byte limiting
behavior in the JDBC driver. To comply with JDBC 2.0, the limit
needs to be removed entirely so that character streams and
strings behave symmetrically. The effect of switching will simply be the
possibility that the back end will have to deal with very long
(>8k) quoted strings. I got the impression from reading TOAST
project documents that all such limitations had been removed,
but I wanted to check before submitting my patch for inclusion
in the distribution.
Thanks,
-- Bruce
--------------------------------------------------------------------------
Bruce Toback                    Tel: (602) 996-8601 | My candle burns at both ends;
OPT, Inc.                            (800) 858-4507 | It will not last the night;
11801 N. Tatum Blvd. Ste. 142                       | But ah, my foes, and oh, my friends -
Phoenix AZ 85028                                    | It gives a lovely light.
btoback@optc.com                                    |    -- Edna St. Vincent Millay
On Sunday, June 24, 2001, at 10:32 PM, Barry Lind wrote:
This is an interesting problem, and I can't think of any easy
solution. But given TOAST in 7.1, the existing implementation
doesn't make sense IMHO. My suggestion would be that the
get/setXXXStream methods work on TOASTed data types and
get/setBlob be used for Blobs.
That would be my preference as well.
As far as your patch goes, I don't see that as a generic
solution. It is equally likely that a Blob could contain less
than 8190 characters, or a varchar could contain more than 8190
characters in 7.1.
It's certainly not a generic solution. I was looking for a
solution that would break fewer of the applications that rely on
the current nonstandard behavior. I'd much prefer to have
get/set...stream simply implement the standard behavior. But not
knowing the Postgres developers' preferences when it comes to
these questions, I chose the break-fewer-existing-apps approach.
If the answer is that the Postgres developers are willing to
tell current JDBC users to switch to the Blob/Clob methods when
that's what they really mean, I'll remove the switch before
submitting the patch.
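For concreteness, here is roughly how application code would look under the split Barry suggests, with streams going to ordinary (TOASTable) character columns and the Blob/Clob methods reserved for large objects. The table and column names are made up:

import java.io.StringReader;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ProposedConventionSketch {
    static void example(Connection conn) throws Exception {
        // A character stream targets an ordinary text/varchar column and is
        // sent to the back end as a quoted string, just like setString().
        String note = "an arbitrarily long piece of text";
        PreparedStatement update = conn.prepareStatement(
                "UPDATE documents SET body = ? WHERE id = ?");
        update.setCharacterStream(1, new StringReader(note), note.length());
        update.setInt(2, 42);
        update.executeUpdate();
        update.close();

        // Applications that really want a large object say so explicitly
        // through the Blob/Clob methods.
        PreparedStatement query = conn.prepareStatement(
                "SELECT attachment FROM documents WHERE id = ?");
        query.setInt(1, 42);
        ResultSet rs = query.executeQuery();
        if (rs.next()) {
            Blob attachment = rs.getBlob("attachment");
            if (attachment != null) {
                System.out.println("attachment length: " + attachment.length());
            }
        }
        rs.close();
        query.close();
    }
}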
-- Bruce
Actually the problem is worse than I thought. Not only do all the
get/setXXXStream methods assume the datatype is a BLOB, but so do
the get/setBytes methods. This means that it isn't possible to
support bytea as the binary datatype without also breaking some
backward compatibility.
Looking at the CVS log, it appears that the stream methods were
only introduced in the 7.1 JDBC driver; since 7.1 has only been
out (in production) for a few months, the number of people
affected will be smaller. The setBytes() method that assumed a
blob was already there in 7.0, so it is likely that more people
will be impacted by any change there.
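To make the setBytes() ambiguity concrete, this is the kind of call in question (the table is made up; the comments reflect my reading of the current driver):

import java.sql.Connection;
import java.sql.PreparedStatement;

public class SetBytesAmbiguity {
    static void store(Connection conn, byte[] image) throws Exception {
        // Hypothetical table: pictures(id int, img <oid today, bytea tomorrow?>)
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO pictures (id, img) VALUES (?, ?)");
        ps.setInt(1, 1);

        // Today the driver turns this into a large object and binds its OID,
        // so 'img' has to be an oid column. If setBytes() were retargeted at
        // bytea, the same call would bind an escaped bytea literal instead,
        // and existing oid-based schemas would break.
        ps.setBytes(2, image);

        ps.executeUpdate();
        ps.close();
    }
}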
thanks,
--Barry
This is an interesting problem, and I can't think of any easy
solution. But given TOAST in 7.1, the existing implementation
doesn't make sense IMHO. My suggestion would be that the
get/setXXXStream methods work on TOASTed data types and
get/setBlob be used for Blobs.
As far as your patch goes, I don't see that as a generic
solution. It is equally likely that a Blob could contain less
than 8190 characters, or a varchar could contain more than 8190
characters in 7.1. Using this number as a magic switch to decide
whether the driver uses the BLOB API or not just won't work in
the general case.
thanks,
--Barry
If you are looking for votes, you can break backward compatibility here.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026