Proposal: RETURNING primary_key()

Started by Igal @ Lucee.orgalmost 10 years ago39 messages

igal@lucee.org

almost 10 years ago

THE ISSUE:

In JDBC there is a flag called RETURN_GENERATED_KEYS --
https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#RETURN_GENERATED_KEYS

Which is left quite ambiguous, but in general it is used to return the
"generated" Primary Key on INSERT/UPDATE/DELETE -- which is mostly
useful in the case of INSERT, of course, as the other commands do not
generate a key.

We can already add RETURNING after an INSERT, e.g.

CREATE TABLE test (name TEXT, id SERIAL PRIMARY KEY);
INSERT INTO test VALUES ('PostgresQL') RETURNING id;

But the problem is that we need to know in advance the name of the "id"
column, because if we had created the table like so:

CREATE TABLE test (name TEXT, test_id SERIAL PRIMARY KEY);

Then we would need to use RETURNING "test_id" instead of "id".

The problem is that we do not always know in advance what the Primary
Key is, and therefore a solution that was implemented in the pgjdbc
driver was to append " RETURNING * " to the query, but that has its own
problems, like returning a lot of data that is not needed, etc. (you
can see a longer discussion at https://github.com/pgjdbc/pgjdbc/issues/488 )

THE PROPOSAL:

The proposal is to allow something like RETURNING primary_key() (it can
be a keyword, not necessarily a function), e.g.

INSERT INTO test VALUES ('PostgresQL') RETURNING primary_key();

Which will return a record set according to the PRIMARY KEY that is set
on the table. So if the primary is "id", then you would get a column
named "id", and if it is "test_id" you would get a column named
"test_id" with the correct values.

If the PRIMARY KEY is made of multiple column, then all of those columns
will be returned.

If the table does not have a PRIMARY KEY constraint then NULL will be
returned with some arbitrary column name.

I would go further and suggest to add a function that will return the
last primary key from a table, e.g.:

SELECT last_primary_key() FROM test;

This of course can be beneficial for many users, and not only the JDBC
community.

Thank you for your time and consideration,

Igal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Joshua D. Drake

jd@commandprompt.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#1)

Re: Proposal: RETURNING primary_key()

On 03/07/2016 12:32 PM, Igal @ Lucee.org wrote:

The problem is that we do not always know in advance what the Primary
Key is, and therefore a solution that was implemented in the pgjdbc

I agree that the problem is that you don't always know what the primary
key is.

I would argue the solution is to check before you write the query.

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Joshua D. Drake (#2)

Re: Proposal: RETURNING primary_key()

On 3/7/2016 12:45 PM, Joshua D. Drake wrote:

I agree that the problem is that you don't always know what the
primary key is.

I would argue the solution is to check before you write the query.

Sure, that would be great, but perhaps I should have give some more context:

We have an application server which allows our developers to query
databases with simplified syntax. Our code is written in a generic way
to allow the developers that use our application server to pass whatever
query they want into the database server, whether it's SQL Server,
MySQL, Oracle, etc.

The code that we use to wrap the SQL statement, as well as the JDBC
code, has no idea about the table or its constraints, so it's not like
I'm writing my own queries, and am just being lazy at checking what the
primary key is. I just can't know what the developer has in his database.

Sure, I can probably query it via metadata tables, etc., but that would
be a much slower process.

Igal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

almost 10 years ago

In reply to: Igal @ Lucee.org (#3)

Re: Proposal: RETURNING primary_key()

"Igal @ Lucee.org" <igal@lucee.org> writes:

On 3/7/2016 12:45 PM, Joshua D. Drake wrote:

I agree that the problem is that you don't always know what the
primary key is.
I would argue the solution is to check before you write the query.

Yeah. I'm rather suspicious of this proposal; I do not think it's
actually very useful to return a primary-key value without any indication
of what the primary key is. There are also corner cases where it seems
pretty ill-defined. For example, suppose you do this on an inheritance
parent table that has a pkey defined, but not all its child tables do
(or maybe they do but their pkeys aren't identical to the parent's).
What should happen then?

Sure, that would be great, but perhaps I should have give some more context:
We have an application server which allows our developers to query
databases with simplified syntax. Our code is written in a generic way
to allow the developers that use our application server to pass whatever
query they want into the database server, whether it's SQL Server,
MySQL, Oracle, etc.

That's an exceptionally weak use-case to argue for this with. Unless
you can get *all* those DBMS suppliers to invent equivalent features,
you're going to have to have pkey-querying logic anyway. The argument
for bespoke syntax for it in just one DBMS seems pretty weak.

I am fairly sure, also, that all of those systems have support for the
SQL-standard information_schema views. So if you write a pkey-identifying
query against those views, you'd have some chance of a solution that
actually did work everywhere.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Tom Lane (#4)

Re: Proposal: RETURNING primary_key()

On 3/7/2016 1:20 PM, Tom Lane wrote:

Yeah. I'm rather suspicious of this proposal; I do not think it's
actually very useful to return a primary-key value without any indication
of what the primary key is. There are also corner cases where it seems
pretty ill-defined. For example, suppose you do this on an inheritance
parent table that has a pkey defined, but not all its child tables do
(or maybe they do but their pkeys aren't identical to the parent's).
What should happen then?

First, thank you for your reply. I appreciate it. I do not know the
answer to that question.

That's an exceptionally weak use-case to argue for this with. Unless
you can get *all* those DBMS suppliers to invent equivalent features,
you're going to have to have pkey-querying logic anyway. The argument
for bespoke syntax for it in just one DBMS seems pretty weak.

Fair enough, but my idea was that this will be used by the JDBC driver
in this case. The other DBMS suppliers have their JDBC driver return a
value, usually it is SERIAL type. But there is no standard for the
column name. In SQL Server, for example, it is IDENTITYCOL while in
MySQL it is GENERATED_KEY.

The thing is that in SQL Server I can do, for example, "SELECT
@@identity" and get the last value that was inserted. In SQL Server,
however, Microsoft took the easy way and enforced only a single
auto-generated identity column per table. The closest thing I can do in
PostgreSQL is "SELECT lastval()" but what if there are multiple
sequences in that table?

I am fairly sure, also, that all of those systems have support for the
SQL-standard information_schema views. So if you write a pkey-identifying
query against those views, you'd have some chance of a solution that
actually did work everywhere.

The other JDBC drivers return the last SERIAL value from the table that
had the insert, so there's no issue there. Querying the
information_schema views with each INSERT will probably cause a major
performance hit.

Anyway, I trust that you know much more about databases than I do, so if
you don't think that it's a good idea, I accept that.

Best,

Igal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Ian Barwick

ian@2ndquadrant.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#1)

Re: Proposal: RETURNING primary_key()

On 08/03/16 05:32, Igal @ Lucee.org wrote:

THE ISSUE:

In JDBC there is a flag called RETURN_GENERATED_KEYS -- https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#RETURN_GENERATED_KEYS

(...)

THE PROPOSAL:

The proposal is to allow something like RETURNING primary_key() (it can be a keyword, not
necessarily a function), e.g.

INSERT INTO test VALUES ('PostgresQL') RETURNING primary_key();

FYI something similar has been proposed before:

/messages/by-id/53953EFB.8070701@2ndquadrant.com

The linked thread might provide more insights into the issues surrounding
this proposal.

Regards

Ian Barwick

--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Ian Barwick (#6)

Re: Proposal: RETURNING primary_key()

Ian,

On 3/7/2016 4:17 PM, Ian Barwick wrote:

FYI something similar has been proposed before:

/messages/by-id/53953EFB.8070701@2ndquadrant.com

The linked thread might provide more insights into the issues surrounding
this proposal.

It's funny how I've encountered the same issue and reached the same
conclusion as you did. The main difference is that I suggested
returning NULL values instead of throwing an error.

I read through the whole thread and it seems to me like there was quite
a bit of support for that feature, with Tom still unconvinced that this
feature is useful -- but quite a few others who see the benefit in it,
especially Java users who experience that problem first hand -- and
Rushabh complaining about white space in the patch?

I'm not sure why it was not accepted at the end?

Igal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#7)

Re: Proposal: RETURNING primary_key()

On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:

I'm not sure why it was not accepted at the end?

The biggest issue, though it might not be clear from that thread, is that
what exactly it means to "return generated keys" is poorly defined by JDBC,
and not necessarily the same thing as "return the PRIMARY KEY".

Should we return the DEFAULT on a UNIQUE column, for example?

IMO other vendors' drivers should be tested for behaviour in a variety of
cases. Ideally the JDBC test suite too. Then specify the exact behaviour of
what we need to satisfy the JDBC driver's requirements and anything else
that might be related.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

almost 10 years ago

In reply to: Craig Ringer (#8)

Re: Proposal: RETURNING primary_key()

Craig Ringer <craig@2ndquadrant.com> writes:

On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:

I'm not sure why it was not accepted at the end?

The biggest issue, though it might not be clear from that thread, is that
what exactly it means to "return generated keys" is poorly defined by JDBC,
and not necessarily the same thing as "return the PRIMARY KEY".

Should we return the DEFAULT on a UNIQUE column, for example?

IMO other vendors' drivers should be tested for behaviour in a variety of
cases.

Yeah. It was asserted in the earlier thread that other vendors implement
this feature as "return the pkey", but that seems to conflict with the
plain language of the JDBC spec: generated columns are an entirely
different thing than primary key columns. So really what I'd like to see
is some work on surveying other implementations to confirm exactly what
behavior they implement. If we're to go against what the spec seems to
say, I want to see a whole lot of evidence that other people do it
consistently in a different way.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Tom Lane (#9)

Re: Proposal: RETURNING primary_key()

On Mon, Mar 7, 2016 at 11:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Craig Ringer <craig@2ndquadrant.com> writes:

On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:

I'm not sure why it was not accepted at the end?

The biggest issue, though it might not be clear from that thread, is that
what exactly it means to "return generated keys" is poorly defined by JDBC,
and not necessarily the same thing as "return the PRIMARY KEY".

Should we return the DEFAULT on a UNIQUE column, for example?

IMO other vendors' drivers should be tested for behaviour in a variety of
cases.

Yeah. It was asserted in the earlier thread that other vendors implement
this feature as "return the pkey", but that seems to conflict with the
plain language of the JDBC spec: generated columns are an entirely
different thing than primary key columns. So really what I'd like to see
is some work on surveying other implementations to confirm exactly what
behavior they implement. If we're to go against what the spec seems to
say, I want to see a whole lot of evidence that other people do it
consistently in a different way.

I agree that some research should be done on how this works in other
systems, but I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that. I feel that some of the replies on this thread were
rather hostile considering that the goal -- good connectors for the
database server -- is extremely important.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Robert Haas (#10)

Re: Proposal: RETURNING primary_key()

On 3/8/2016 12:12 PM, Robert Haas wrote:

I agree that some research should be done on how this works in other
systems, but I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that.

I will try to gather more information about the other DBMSs and drivers
and will post my findings here when I have them.

Best,

Igal

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#11)

Re: Proposal: RETURNING primary_key()

On 9 March 2016 at 05:40, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 12:12 PM, Robert Haas wrote:

I agree that some research should be done on how this works in other
systems, but I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that.

I will try to gather more information about the other DBMSs and drivers
and will post my findings here when I have them.

Thanks. I know that's not the most fun thing to do in the world, but it's
often needed when implementing something where part of the goal is being
compatible with other vendors, etc.

Currently I suggest using Connection.prepareStatement(..., String[]
generatedKeyColumns) where possible. I realise that's not practical for all
apps, which is why supporting the int flag form better is desirable, and we
just have to figure out what exactly we should be returning...

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Robert Haas (#10)

Re: Proposal: RETURNING primary_key()

On 9 March 2016 at 04:12, Robert Haas <robertmhaas@gmail.com> wrote:

I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that.

There are a few frustrations, to be sure, but I'm not sure there's actually
a ton server-side that drastically limits the driver.

One of the worst problems (IMO) is in the driver architecture its self. It
attempts to prevent blocking by guestimating the server's send buffer state
and its recv buffer state, trying to stop them filling and causing the
server to block on writes. It should just avoid blocking on its own send
buffer, which it can control with confidence. Or use some of Java's rather
good concurrency/threading features to simultaneously consume data from the
receive buffer and write to the send buffer when needed, like pgjdbc-ng
does. This makes making use of the pipelining features in Pg's protocol way
harder and less efficient than it should be - but then, PgJDBC still does
this better than libpq, which can't pipeline queries at all.

There certainly are server/protocol frustrations.

QUERY CANCEL RACES
---

Query cancellation sucks badly. Not because it requires a new connection,
though that's unfortunate, but because cancel is backend-level not
statement-level. A statement cancellation key returned as an immediate
response to the Execute message would be quite handy, so we could include
it in cancel requests and eliminate the race by having the cancel request
be a no-op if the statement cancel key doesn't match the currently running
statement.

EARLY CONNECTION CHARSETS
---

There's no way to know the charset of early connection error messages,
which is a flaw in the protocol that isn't specific to PgJDBC its self.
Similarly, you can't specify the text encoding of usernames, passwords, etc
sent to the server.

PER-QUERY GUCs
---

We also have no way to set GUCs per-query, and we need it for
statement_timeout. I really wish Parse and Execute messages allowed
statement-scoped GUCs to be passed at the protocol level. This would be
very, very helpful. The driver can probably work around it by fetching and
then SETing statement_timeout, running the query, then reSETing it
afterwards in a piplelined set of queries, but .... yuck. Also, log spam
galore.

GENERATED KEYS AND RETURNING
---

To get generated keys we have to hack the statement text. There's no
protocol-level equivalent, like we have for row-count limits in the v3
protocol. The ability to specify the set of returned columns at the
protocol level would be very nice. That said, hacking the statement text
isn't *too* bad, mostly because few people are going to do their own
RETURNING statement *and* request generated keys from the driver, the only
time this becomes an issue.

STRING TYPE ISSUES
---

PgJDBC can work around Pg's IMO somewhat overzealous type checks for string
types by passing string parameters as being of unknown-type. The JDBC
interface offers us no easy way to differentiate between "this parameter is
a real textual value" and "this parameter is a string representation of
something that might be another type". We can do it with setObject and
extension class wrappers, but then the user has to import the JDBC driver's
classes directly, use PgJDBC-specific API, etc. The people who have the
most problem with our current behaviour are those least able to do that,
users who're behind a query generation layer or ORM. I'd like to just make
stringtype=unspecified the default in PgJDBC and be done with it; users can
still specify an explicit cast to 'text' in the SQL if they want

PROTOCOL-LEVEL SAVEPOINTS
---

psqlODBC would benefit from protocol-level SAVEPOINT and ROLLBACK TO
SAVEPOINT, mostly to reduce logspam and parser overhead. PgJDBC would be
able to use this to emulate other DBMSes error handling behaviour too, when
requested by a client. (Yes, I know about the correctness and performance
issues, but you tell that to someone who just wants to Port Their Stuff
From Oracle But Can't Change The Code).

SERVER_VERSION_NUM
---

server_version_num should be GUC_REPORT and it's really annoying that it
isn't. I never agreed with the arguments about why that wasn't changed, and
I still want it changed.

LOST TYPMOD, NULLABILITY INFO
---

The server throws away typmod and nullability knowledge as soon as you do
anything with a column. This is frustrating for the driver's metadata API
support. Having result columns marked non-null in Describe would be handy.

LAZY BYTEA
---

The protocol offers no way to lazily fetch large values like BYTEA. Many
vendors can fetch small results and return a handle that gets larger
results from the server on-demand. This means that many clients expect that

SELECT * FROM my_table_with_100MB_bytea_column;

will not fetch all those bytea values to the client until/unless they're
actually accessed. They don't have to generate new and different queries
each time. ORMs in particular benefit from this. Ideally we'd have the
protocol-level ability to return a handle to the relevant TOAST entry that
clients can then fetch using further protocol messages on-demand so long as
they're on the same session, haven't committed or rolled back, and haven't
run another statement. This would make working with big binary objects in
the DB considerably more practical.

I'm sure there are others I haven't remembered or run into in there too.

See https://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes for some.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Craig Ringer (#13)

Re: Proposal: RETURNING primary_key()

On Tue, Mar 8, 2016 at 8:12 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

There certainly are server/protocol frustrations.

I'm sympathetic to all of these and think we should work on fixing
them, particularly...

STRING TYPE ISSUES
---

PgJDBC can work around Pg's IMO somewhat overzealous type checks ...

This.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Robert Haas (#14)

Re: Proposal: RETURNING primary_key()

On 9 March 2016 at 21:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 8, 2016 at 8:12 PM, Craig Ringer <craig@2ndquadrant.com>
wrote:

There certainly are server/protocol frustrations.

I'm sympathetic to all of these and think we should work on fixing
them, particularly...

STRING TYPE ISSUES
---

PgJDBC can work around Pg's IMO somewhat overzealous type checks ...

This.

I've raised that multiple times and got nowhere. More importantly, I'm
reasonably convinced that passing string types as UNKNOWNOID is what users
generally want and expect anyway, and is consistent with what happens when
you write string literals directly in SQL. I think we should just change
PgJDBC to default to this already-optional behaviour, which is currently
controlled by the stringtype=unspecified JDBC parameter.

Other drivers have similar issues, and can fix it the same way. Maybe we
should document it somewhere, but I think this is in many ways the least
deserving of attention. Partly because clients can work around it easily,
partly because the energy input required for any change will be prohibitive
and is better spent elsewhere.

I'd *much* rather have things like query cancel cookies, per-query GUCs at
the protocol level, etc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#16

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Craig Ringer (#13)

Re: Proposal: RETURNING primary_key()

On 3/8/2016 5:12 PM, Craig Ringer wrote:

One of the worst problems (IMO) is in the driver architecture its
self. It attempts to prevent blocking by guestimating the server's
send buffer state and its recv buffer state, trying to stop them
filling and causing the server to block on writes. It should just
avoid blocking on its own send buffer, which it can control with
confidence. Or use some of Java's rather good concurrency/threading
features to simultaneously consume data from the receive buffer and
write to the send buffer when needed, like pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#16)

Re: Proposal: RETURNING primary_key()

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

One of the worst problems (IMO) is in the driver architecture its self.
It attempts to prevent blocking by guestimating the server's send buffer
state and its recv buffer state, trying to stop them filling and causing
the server to block on writes. It should just avoid blocking on its own
send buffer, which it can control with confidence. Or use some of Java's
rather good concurrency/threading features to simultaneously consume data
from the receive buffer and write to the send buffer when needed, like
pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for
anything except new releases) and older JDBC specs, completeness of support
for some extensions. TBH I haven't done a ton with -ng yet.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#18

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Craig Ringer (#12)

Re: Proposal: RETURNING primary_key()

On 3/8/2016 4:42 PM, Craig Ringer wrote:

On 9 March 2016 at 05:40, Igal @ Lucee.org <igal@lucee.org
<mailto:igal@lucee.org>> wrote:

I will try to gather more information about the other DBMSs and
drivers and will post my findings here when I have them.

Thanks. I know that's not the most fun thing to do in the world, but
it's often needed when implementing something where part of the goal
is being compatible with other vendors, etc.

It seems that the implementations vary by the driver, and not the
server, as evidenced by the Microsoft SQL Server drivers -- I tested
both the official MS driver and the open sourced jTDS driver.

I noticed that you usually don't put html in the emails here, but I
think that it's appropriate here to show the information in a clear way
(also, according to my computer it's 2016). I hope that it will be
rendered properly:

*MySQL* *DB2* *SQL Server (MS)* *SQL Server (jTDS)* *Oracle*
*Returned Type* SET SET ROW ROW ROW
*Column Name* GENERATED_KEY [name of identity col] GENERATED_KEYS
ID ROWID
*Column Type* Unknown (numeric) integer numeric numeric ROWID
*Value* Each inserted value to identity column Each inserted value to
identity column Last inserted value to identity column Last inserted
value to identity column internal address location that does not change
on UPDATE
*Example* (1), (2) (1), (2) (2) (2) AAAE5nAABAAALCxAAM

Some notes and observations:

It's the Wild West! Each implementation does something completely
different. Even when something looks similar, e.g. the returned column
name from MySQL and SQL Server (MS), it's not: notice the plural in SQL
Server's column name, which is ironic as they only return a single
value, as opposed to MySQL which returns a SET.

This has been an "interesting experience" as it was my first exposure to
some of those DBMSs. It only reinforced my decision to choose PostgreSQL
moving forward, over the alternatives (after using SQL Server for about
20 years).

More notes on the different DBMSs:

The first thing that I tested was against *MySQL*:

CREATE TABLE IF NOT EXISTS test_jdbc(name VARCHAR(64), id SERIAL);

An insert to that table via JDBC, with int flag RETURN_GENERATED_KEYS
returns a result set with a column named "GENERATED_KEY " and type
"UNKNOWN" (as per ResultSetMetaData's getColumnTypeName()), each row in
the result set corresponded with an inserted record, so for example:

INSERT INTO test_jdbc(name) VALUES ('JDBC'), ('PostgreSQL');

returned two rows with the value of the "id" column for the inserted row
in each, e.g.

GENERATED_KEY
-------------
7
8

Trying to add multiple SERIAL columns to a table results in an error:

CREATE TABLE IF NOT EXISTS jdbc(j_name VARCHAR(64), j_id SERIAL,
id2 SERIAL)

Error Code: 1075. Incorrect table definition; there can be only one auto
column and it must be defined as a key

*SQL Server*: via the Microsoft driver

Created table with the command:

CREATE TABLE dbo.jdbc (
j_name varchar(64) NOT NULL,
j_id int IDENTITY(1,1) NOT NULL
)

Generated Keys return a single row with a column named "GENERATED_KEYS"
of type numeric, and the value is the last inserted id (i.e. sequence).
This is different from MySQL which returns a row with the id for each
inserted record.

*SQL Server*: via the jTDS driver

Generated Keys return a single row with a column named "ID" of type
numeric, and the value is the last inserted id (i.e. sequence). The
behavior is similar to the Microsoft driver, but the column name is
different.

*Oracle*:

Oracle returns the column ROWID which is of type ROWID as well:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/pseudocolumns008.htm

This seems to be similar to PostgreSQL's ctid, but unlike ctid -- when I
UPDATE the record the ROWID remains unchanged.

In my test I got the value "AAAE5nAABAAALCxAAM", and when I later ran:

SELECT * FROM jdbc WHERE ROWID='AAAE5nAABAAALCxAAM';

I got the information back from that row. Updating that row does not
change its ROWID.

When I tried to insert multiple values with RETURN_GENERATED_KEYS I got
an error: java.sql.SQLSyntaxErrorException: ORA-00900: invalid SQL
statement

INSERT INTO jdbc(j_name) SELECT 'PG 9.5.0' FROM DUAL UNION SELECT
'PG 9.5.1' FROM DUAL

The rows are, however, inserted into the table. Running the same INSERT
command without RETURN_GENERATED_KEYS works without error.

(Side note: This was my first, and hopefully my last, experience with
Oracle database, and it's been a real PITA. If I had tried it out some
20 years ago then the experience would have probably led me to sell the
stock short, which would have probably ended with my bankruptcy. Go
figure...)

*IBM DB2*:

CREATE TABLE jdbc(j_name VARCHAR(64), j_id INT NOT NULL GENERATED
ALWAYS AS IDENTITY)

Generated Keys return a set with the column named "J_ID" of type
integer. One row for each inserted row.

(Side note: after wasting almost a full day setting up and connecting to
the DB2 server I realized why Oracle was so successful)

#19

Robert Haas

robertmhaas@gmail.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#18)

Re: Proposal: RETURNING primary_key()

On Mar 10, 2016, at 2:07 PM, Igal @ Lucee.org <igal@lucee.org> wrote:

(Side note: This was my first, and hopefully my last, experience with Oracle database, and it's been a real PITA. If I had tried it out some 20 years ago then the experience would have probably led me to sell the stock short, which would have probably ended with my bankruptcy. Go figure...)

(Side note: after wasting almost a full day setting up and connecting to the DB2 server I realized why Oracle was so successful)

This email made me laugh.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#18)

Re: Proposal: RETURNING primary_key()

On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:

I noticed that you usually don't put html in the emails here, but I think
that it's appropriate here to show the information in a clear way (also,
according to my computer it's 2016).

Pretty sure we have at least one person here using mailreader software
that's old enough to vote in most countries, but I tend to share the
sentiment. At least when there's actually a functional reason like this :)

Thanks so much for doing this testing.

I hope that it will be rendered properly:

*MySQL* *DB2* *SQL Server (MS)* *SQL Server (jTDS)* *Oracle*
*Returned Type* SET SET ROW ROW ROW
*Column Name* GENERATED_KEY [name of identity col] GENERATED_KEYS ID ROWID
*Column Type* Unknown (numeric) integer numeric numeric ROWID
*Value* Each inserted value to identity column Each inserted value to
identity column Last inserted value to identity column Last inserted
value to identity column internal address location that does not change
on UPDATE
*Example* (1), (2) (1), (2) (2) (2) AAAE5nAABAAALCxAAM
Some notes and observations:

It's the Wild West! Each implementation does something completely
different.

I honestly didn't expect that. I knew Oracle returned ROWID, but I have to
admit I thought the others would probably just return the key column(s).

When you supply the column type, does that (with the exception of Oracle)
match the column type of the generated key col?

Did you try GENERATED ALWAYS cols (where supported), UNIQUE columns with
DEFAULTs, composite columns, etc? Part of the question for Pg is what
exactly we should and should not be returning.

(Side note: This was my first, and hopefully my last, experience with
Oracle database, and it's been a real PITA. If I had tried it out some 20
years ago then the experience would have probably led me to sell the stock
short, which would have probably ended with my bankruptcy. Go figure...)

I rather less than fondly recall my own attempts to get Oracle Express
installed and running for some test or another a while ago. Amazing that it
can be that fiddly. MS-SQL on the other hand "just worked" and dropped me
into the most gorgeously wonderful admin tool and SQL editor ever.

I wonder if any of these drivers have extension options and compat flags
that you have to turn on to get better behaviour like returning a set? Or
if they're just that limited?

Anyway, from the sounds of this we have a fair bit of freedom to define
what we want at both the Pg and driver level so long as we satisfy the
basic constraint that we should return a set of generated keys in the case
where a statement does an insert that adds rows to a table with a SERIAL
(or an owned SEQUENCE). Seems like we could do pretty much whatever we want
for multiple-generated-columns cases etc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#21

Joshua D. Drake

jd@commandprompt.com

almost 10 years ago

In reply to: Craig Ringer (#20)

Re: Proposal: RETURNING primary_key()

On 03/10/2016 08:28 PM, Craig Ringer wrote:

On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org
<mailto:igal@lucee.org>> wrote:

I noticed that you usually don't put html in the emails here, but I
think that it's appropriate here to show the information in a clear
way (also, according to my computer it's 2016).

Pretty sure we have at least one person here using mailreader software
that's old enough to vote in most countries, but I tend to share the
sentiment. At least when there's actually a functional reason like this :)

That person needs to suck it up. Email is no longer just fixed width
text and hasn't been in a decade.

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22

Vladimir Sitnikov

sitnikov.vladimir@gmail.com

almost 10 years ago

In reply to: Craig Ringer (#20)

Re: Proposal: RETURNING primary_key()

Igal, thanks for the analysis.

Craig>Part of the question for Pg is what exactly we should and should not
be returning.

I think the following might be a good starting point: return set of columns
that would identify the inserted row(s).
E.g. columns of any primary key would do. Columns of any unique key would
do as well.
"returning *" would do as well, however it would return excessive columns,
thus it would be less efficient.

I do not think it makes sense to tie "generated keys" to sequences or
things like that.

For example:
1) Consider Pg returns column_name==ABC, value==42. That would mean
client could locate exactly that row via "where ABC=42"
2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146).
Then client would be able to locate the row via "where col1=42 and col2=146
3) If multiple unique keys present, it is fine if Pg returns one or the
another depending on the phase of the moon. Yet more compact key would be
preferable to save on bandwidth.

Does that make sense?

I think naming the resulting column(s) like "generated_key" /
"generated_keys" does not make much sense. Especially, for multi-column
keys.

If ctid was update-proof, it could could do. Unfortunately, ctid might
easily get changed.

Theoretical end-to-end (it is the only use of "generated keys" I can
imagine at the moment):
1) Client issues an insert statement, asking "generated keys"
2) Pg inserts the row, and returns resultset with "primary key" (or unique
key) columns.
3) Client stores it somewhere. For instance, passes that to UI.
4) As UI wants to update the row, client just uses those keys to identify
the row to update.

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray"
kind of API. ORMs should know the column names of the primary keys => ORMs
should use "return specific column names" API, not just "return something
generated".

Vladimir

#23

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Vladimir Sitnikov (#22)

Re: Proposal: RETURNING primary_key()

On 11 March 2016 at 16:00, Vladimir Sitnikov <sitnikov.vladimir@gmail.com>
wrote:

Igal, thanks for the analysis.

Craig>Part of the question for Pg is what exactly we should and should
not be returning.

I think the following might be a good starting point: return set of
columns that would identify the inserted row(s).
E.g. columns of any primary key would do. Columns of any unique key would
do as well.
"returning *" would do as well, however it would return excessive columns,
thus it would be less efficient.

I do not think it makes sense to tie "generated keys" to sequences or
things like that.

That's why (sorry, Igal) I'd like to see some more tests for cases other
than identity columns. How is GENERATED ALWAYS handled, if supported? What
about if it's on a UNIQUE column? How about a PRIMARY KEY whose value is
assigned by a DEFAULT or by a trigger?

Based on the rather funky behaviour Igal found I suspect the answer will be
"nothing much" for all of those, i.e. it just doesn't work with other
drivers/vendors. But I'd like to know. I

2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146).
Then client would be able to locate the row via "where col1=42 and col2=146

Yeah, I was wondering about composite PKs. I think Igal focused only on
generated synthetic keys, which are after all overwhelmingly common case
when getting generated keys.

3) If multiple unique keys present, it is fine if Pg returns one or the

another depending on the phase of the moon. Yet more compact key would be
preferable to save on bandwidth.

I disagree there. Behavour must be well-defined and predictable unless it's
really unavoidable.

I think naming the resulting column(s) like "generated_key" /
"generated_keys" does not make much sense. Especially, for multi-column
keys.

Yeah. At least in PgJDBC where it's a separate resultset (IIRC), so you
have metadata that means you don't have to guess column names etc.

If ctid was update-proof, it could could do. Unfortunately, ctid might
easily get changed.

Indeed. Especially since many of the apps that want to fetch generated keys
will be connection-pool oriented apps doing optimistic concurrency control
- ORMs and the like. So they won't be able to hold the transaction that
added the row open (to hold locks and protect against vacuum) while
fetching more info about the row. That'd be quite undesirable for
performance anyway, since it'd force at least one extra round-trip; you
couldn't pipeline the query for more info about the row until you knew the
ctid of the inserted row.

using ctid is a nonstarter IMO, at least as far as the client goes.

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray"

kind of API. ORMs should know the column names of the primary keys => ORMs
should use "return specific column names" API, not just "return something
generated".

Yep. There are many "should"s. I absolutely agree that this is one of them.

One reason some clients do it this way is that earlier versions of the JDBC
API didn't have the String[] generatedKeys form of prepareStatement. So
they had to cope with not being able to ask for specific cols and getting
whatever the DB handed them.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#24

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Craig Ringer (#23)

Re: Proposal: RETURNING primary_key()

On 3/11/2016 12:40 AM, Craig Ringer wrote:

That's why (sorry, Igal) I'd like to see some more tests for cases
other than identity columns. How is GENERATED ALWAYS handled, if
supported? What about if it's on a UNIQUE column? How about a PRIMARY
KEY whose value is assigned by a DEFAULT or by a trigger?

I was using Oracle 11g XE, GENERATED ALWAYS was not available. This is
the code I used for Oracle:

CREATE TABLE jdbc (j_name VARCHAR2(64) NOT NULL, j_id NUMBER(10) NOT
NULL);

CREATE SEQUENCE jdbc_seq;

CREATE OR REPLACE TRIGGER jdbc_seq_trigger
BEFORE INSERT ON jdbc
FOR EACH ROW
WHEN (new.j_id IS NULL)
BEGIN
SELECT jdbc_seq.NEXTVAL
INTO :new.j_id
FROM dual;
END;
/

For DB2 the type is indeed GENERATED ALWAYS AS IDENTITY:

j_id INT GENERATED ALWAYS AS IDENTITY

Originally the name was ID but when both DB2 and MS/jTDS returned a
column named "ID" I realized that it might come from the column name, so
I modified the column name. DB2 was indeed returning the column name,
while MS/jTDS returns a column named "ID" regardless of the actual
column name.

Based on the rather funky behaviour Igal found I suspect the answer
will be "nothing much" for all of those, i.e. it just doesn't work
with other drivers/vendors. But I'd like to know.

I agree, but I can test it if you give me the SQL commands. I do want
to remove all of that horrible software from my workstation as soon as
possible, but it can wait if more testing is required.

2) Same for multicolumn keys: Pg just returns (col1, col2) ==
(42, 146). Then client would be able to locate the row via "where
col1=42 and col2=146

Yeah, I was wondering about composite PKs. I think Igal focused only
on generated synthetic keys, which are after all overwhelmingly common
case when getting generated keys.

If you give me the code that you want to test I will test it.

3) If multiple unique keys present, it is fine if Pg returns one
or the another depending on the phase of the moon. Yet more
compact key would be preferable to save on bandwidth.

I disagree there. Behavour must be well-defined and predictable unless
it's really unavoidable.

I agree with Craig.

I think naming the resulting column(s) like "generated_key" /
"generated_keys" does not make much sense. Especially, for
multi-column keys.

Yeah. At least in PgJDBC where it's a separate resultset (IIRC), so
you have metadata that means you don't have to guess column names etc.

I'm not sure how multi-column keys work. In both MySQL and SQL Server
for example, you can not have more than one SEQUENCE column, so perhaps
that's their "solution".

Igal

#25

Michael Paquier

michael.paquier@gmail.com

almost 10 years ago

In reply to: Craig Ringer (#20)

Re: Proposal: RETURNING primary_key()

On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:

I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the information in a clear way (also, according to my computer it's 2016).

Pretty sure we have at least one person here using mailreader software that's old enough to vote in most countries, but I tend to share the sentiment. At least when there's actually a functional reason like this :)

This one made me smile. Extracting a quote from another project's
slogan: All mail clients s**k. Some of them just s**k less.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26

Gavin Flower

GavinFlower@archidevsys.co.nz

almost 10 years ago

In reply to: Michael Paquier (#25)

Re: Proposal: RETURNING primary_key()

On 12/03/16 23:27, Michael Paquier wrote:

On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:

I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the information in a clear way (also, according to my computer it's 2016).

Pretty sure we have at least one person here using mailreader software that's old enough to vote in most countries, but I tend to share the sentiment. At least when there's actually a functional reason like this :)

This one made me smile. Extracting a quote from another project's
slogan: All mail clients s**k. Some of them just s**k less.

s**k == sulk???

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27

Dave Cramer

pg@fastcrypt.com

almost 10 years ago

In reply to: Craig Ringer (#17)

Re: Proposal: RETURNING primary_key()

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

One of the worst problems (IMO) is in the driver architecture its self.
It attempts to prevent blocking by guestimating the server's send buffer
state and its recv buffer state, trying to stop them filling and causing
the server to block on writes. It should just avoid blocking on its own
send buffer, which it can control with confidence. Or use some of Java's
rather good concurrency/threading features to simultaneously consume data
from the receive buffer and write to the send buffer when needed, like
pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for
anything except new releases) and older JDBC specs, completeness of support
for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng
over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more
performant than ng.

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

#28

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Dave Cramer (#27)

Re: Proposal: RETURNING primary_key()

On 4/3/2016 8:21 AM, Dave Cramer wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com
<mailto:craig@2ndquadrant.com>> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support
for anything except new releases) and older JDBC specs,
completeness of support for some extensions. TBH I haven't done a
ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use
-ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more
performant than ng.

That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being
performant if, for example, I INSERT a large chunk of data like an image
data or an uploaded file.

Igal

#29

Stephen Frost

sfrost@snowman.net

almost 10 years ago

In reply to: Dave Cramer (#27)

Re: Proposal: RETURNING primary_key()

* Dave Cramer (pg@fastcrypt.com) wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

One of the worst problems (IMO) is in the driver architecture its self.
It attempts to prevent blocking by guestimating the server's send buffer
state and its recv buffer state, trying to stop them filling and causing
the server to block on writes. It should just avoid blocking on its own
send buffer, which it can control with confidence. Or use some of Java's
rather good concurrency/threading features to simultaneously consume data
from the receive buffer and write to the send buffer when needed, like
pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for
anything except new releases) and older JDBC specs, completeness of support
for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng
over pgjdbc ?

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I just
ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

As to your question, you may be interested to know that pgjdbc is more
performant than ng.

Interesting, good to know.

Thanks!

Stephen

#30

Dave Cramer

pg@fastcrypt.com

almost 10 years ago

In reply to: Stephen Frost (#29)

Re: Proposal: RETURNING primary_key()

On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:

* Dave Cramer (pg@fastcrypt.com) wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

One of the worst problems (IMO) is in the driver architecture its

self.

It attempts to prevent blocking by guestimating the server's send

buffer

state and its recv buffer state, trying to stop them filling and

causing

the server to block on writes. It should just avoid blocking on its

own

send buffer, which it can control with confidence. Or use some of

Java's

rather good concurrency/threading features to simultaneously consume

data

from the receive buffer and write to the send buffer when needed,

like

pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for
anything except new releases) and older JDBC specs, completeness of

support

for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng
over pgjdbc ?

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I just
ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

Yes, it is a custom api. Easy enough to add. Is this something of interest ?

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

#31

Dave Cramer

pg@fastcrypt.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#28)

Re: Proposal: RETURNING primary_key()

On 3 April 2016 at 12:18, Igal @ Lucee.org <igal@lucee.org> wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for
anything except new releases) and older JDBC specs, completeness of support
for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng
over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more
performant than ng.

That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being
performant if, for example, I INSERT a large chunk of data like an image
data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

Show quoted text

Igal

#32

Stephen Frost

sfrost@snowman.net

almost 10 years ago

In reply to: Dave Cramer (#30)

Re: Proposal: RETURNING primary_key()

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:

On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I just
ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

Yes, it is a custom api. Easy enough to add. Is this something of interest ?

I'd say that there is definite interest in this and there's a lot of
conversation about it on the interwebs (stackoverflow, etc).

My understanding is that the problem is actually with the SSL library
that the JDBC driver uses and that it basically lies about if there are
bytes available for reading (claiming that there never is by always
returning zero). The -ng driver, as I understand it, uses a newer SSL
library which better supports asking if there are bytes available to
read.

Thanks!

Stephen

#33

Dave Cramer

pg@fastcrypt.com

almost 10 years ago

In reply to: Stephen Frost (#32)

Re: Proposal: RETURNING primary_key()

On 3 April 2016 at 21:56, Stephen Frost <sfrost@snowman.net> wrote:

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:

On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I

just

ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

Yes, it is a custom api. Easy enough to add. Is this something of

interest ?

I'd say that there is definite interest in this and there's a lot of
conversation about it on the interwebs (stackoverflow, etc).

My understanding is that the problem is actually with the SSL library
that the JDBC driver uses and that it basically lies about if there are
bytes available for reading (claiming that there never is by always
returning zero). The -ng driver, as I understand it, uses a newer SSL
library which better supports asking if there are bytes available to
read.

Hmmm. that complicates things...

Async notification is the easier part, I wasn't aware that the ssl library
had this problem though

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

#34

Craig Ringer

craig@2ndquadrant.com

almost 10 years ago

In reply to: Dave Cramer (#33)

Re: Proposal: RETURNING primary_key()

On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:

Async notification is the easier part, I wasn't aware that the ssl library
had this problem though

AFAIK the issue is that even if there are bytes available on the underlying
socket, the SSL lib doesn't know if that means there are bytes readable
from the wrapped SSL socket. The traffic on the underlying socket could be
renegotiation messages or whatever.

We really need non-blocking reads.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#35

Stephen Frost

sfrost@snowman.net

almost 10 years ago

In reply to: Dave Cramer (#33)

Re: Proposal: RETURNING primary_key()

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:

On 3 April 2016 at 21:56, Stephen Frost <sfrost@snowman.net> wrote:

* Dave Cramer (pg@fastcrypt.com) wrote:

On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I

just

ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

Yes, it is a custom api. Easy enough to add. Is this something of

interest ?

I'd say that there is definite interest in this and there's a lot of
conversation about it on the interwebs (stackoverflow, etc).

My understanding is that the problem is actually with the SSL library
that the JDBC driver uses and that it basically lies about if there are
bytes available for reading (claiming that there never is by always
returning zero). The -ng driver, as I understand it, uses a newer SSL
library which better supports asking if there are bytes available to
read.

Hmmm. that complicates things...

Async notification is the easier part, I wasn't aware that the ssl library
had this problem though

Right. It's not sufficient to simply poll the JDBC driver to see if
there are notifications currently, you have to actually generate traffic
between the client and the server, to force the driver to read from the
SSL library and discover any notifications which have arrived from the
server. That can be done by issuing an all-whitespace command, which
the server will respond to with an EmptyQueryMessage (iirc), but you
can't simply have the Java side sit in a select() loop or similar
waiting for notifications to arrive.

Thanks!

Stephen

#36

Stephen Frost

sfrost@snowman.net

almost 10 years ago

In reply to: Craig Ringer (#34)

Re: Proposal: RETURNING primary_key()

* Craig Ringer (craig@2ndquadrant.com) wrote:

On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:

Async notification is the easier part, I wasn't aware that the ssl library
had this problem though

AFAIK the issue is that even if there are bytes available on the underlying
socket, the SSL lib doesn't know if that means there are bytes readable
from the wrapped SSL socket. The traffic on the underlying socket could be
renegotiation messages or whatever.

We really need non-blocking reads.

That would certainly be a good way to address this, but I'm guessing
it's non-trivial to implement.

Thanks!

Stephen

#37

Dave Cramer

pg@fastcrypt.com

almost 10 years ago

In reply to: Stephen Frost (#36)

Re: Proposal: RETURNING primary_key()

On 3 April 2016 at 22:20, Stephen Frost <sfrost@snowman.net> wrote:

* Craig Ringer (craig@2ndquadrant.com) wrote:

On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:

Async notification is the easier part, I wasn't aware that the ssl

library

had this problem though

AFAIK the issue is that even if there are bytes available on the

underlying

socket, the SSL lib doesn't know if that means there are bytes readable
from the wrapped SSL socket. The traffic on the underlying socket could

be

renegotiation messages or whatever.

We really need non-blocking reads.

That would certainly be a good way to address this, but I'm guessing
it's non-trivial to implement.

AFAICT, the ng driver still has to generate traffic as well.

Dave Cramer

davec@postgresintl.com
www.postgresintl.com

Show quoted text

Thanks!

Stephen

#38

Igal @ Lucee.org

igal@lucee.org

almost 10 years ago

In reply to: Dave Cramer (#31)

Re: Proposal: RETURNING primary_key()

On 4/3/2016 4:34 PM, Dave Cramer wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

I'd like to turn this question around. Are there good reasons to
use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is
more performant than ng.

That's good to know, but unfortunately pgjdbc is unusable for us
until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING *
being performant if, for example, I INSERT a large chunk of data
like an image data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

I'm not sure if you're serious or if you're just trying to be "cute".
This ticket should still be fixed. It really doesn't make any sense to
me that the driver will just blindly append "RETURNING *" to the query.

If I want to return all of the columns from an UPDATE or an INSERT --
then I will add "RETURNING *" myself. And if I don't add it, then I
probably don't want the driver to second guess me, or to think that it
knows better than I do what I want. If I wanted software that thinks
that it knows what I want better than I do -- then I would stick with
SQL Server rather than switch to Postgres.

The driver used to work until someone decided to append "RETURNING *" to
the SQL code and make it unusable in many cases.

Was there any discussion on this before it was added?

Igal Sapir
Lucee Core Developer
Lucee.org <http://lucee.org/>

#39

David G. Johnston

david.g.johnston@gmail.com

almost 10 years ago

In reply to: Igal @ Lucee.org (#38)

Re: Proposal: RETURNING primary_key()

On Sun, Apr 3, 2016 at 9:49 PM, Igal @ Lucee.org <igal@lucee.org> wrote:

On 4/3/2016 4:34 PM, Dave Cramer wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

I'd like to turn this question around. Are there good reasons to use -ng
over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more
performant than ng.

That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being
performant if, for example, I INSERT a large chunk of data like an image
data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

I'm not sure if you're serious or if you're just trying to be "cute".
This ticket should still be fixed. It really doesn't make any sense to me
that the driver will just blindly append "RETURNING *" to the query.

If I want to return all of the columns from an UPDATE or an INSERT -- then
I will add "RETURNING *" myself. And if I don't add it, then I probably
don't want the driver to second guess me, or to think that it knows better
than I do what I want. If I wanted software that thinks that it knows what
I want better than I do -- then I would stick with SQL Server rather than
switch to Postgres.

The driver used to work until someone decided to append "RETURNING *" to
the SQL code and make it unusable in many cases.

Was there any discussion on this before it was added?

Except the main problem you describe is one where you WANT the driver to
be smart and understand that even though you've asked it to return
generated keys the statement you've provided it is one that incapable of
doing so. Thus you do want it to interpret what you've told it and to do
what you mean and not what you say.

Obviously the problem is solvable - you yourself have said other's have
solved it. That is one piece of good news - the other piece is that
PostgreSQL, and the JDBC driver in question, is open source software.

Somehow the driver needs to determine, reliably and ideally inexpensively,
how to effect:

"This parameter is ignored if the SQL statement is not an INSERT statement,
or an SQL statement able to return auto-generated keys (the list of such
statements is vendor-specific)."

https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#prepareStatement(java.lang.String,%20int)

Discussions and patches exploring how to go about that are welcomed.

I do think that issue 488 needs to separate out and fix the non-conformance
to the API that is present - namely not ignoring the "int" argument when
the supplied statement is not capable (i.e., not an INSERT statement) - and
posted such (and a bit more) on the issue itself.

David J.