Frontend/backend protocol improvements proposal (request).

Started by Dmitriy Igrishinover 12 years ago14 messages
#1Dmitriy Igrishin
dmitigr@gmail.com

Hackers,

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message
for the following commands:
PREPARE;
DEALLOCATE;
DECLARE;
CLOSE;
LISTEN;
UNLISTEN;
SET;
RESET.
Namely, for example, users of my library can prepare statements by using
protocol directly or via PREPARE command. Since the protocol does not
supports prepared statement deallocation, I wrote a wrapper over DEALLOCATE
command. The library knows about all prepared statements and
invalidates them automatically when user performs deallocate() wrapper.
But users can go with DEALLOCATE command directly and in these cases
I need to query the database to get the list of currently prepared
statements
whenever CommandComplete message with DEALLOCATE command tag
is consumed. Moreover, I need to do it *synchronously* and this breaks
asynchronous API.
I propose to include name of the object in the CommandComplete (B)
message for the above commands.

--
// Dmitriy.

#2Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Dmitriy Igrishin (#1)
Re: Frontend/backend protocol improvements proposal (request).

Dmitriy Igrishin wrote:

Sent: Thursday, June 20, 2013 5:09 PM
To: PostgreSQL Hackers
Subject: [HACKERS] Frontend/backend protocol improvements proposal (request).

Hackers,

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message
for the following commands:
PREPARE;
DEALLOCATE;
DECLARE;
CLOSE;
LISTEN;
UNLISTEN;
SET;
RESET.
Namely, for example, users of my library can prepare statements by using
protocol directly or via PREPARE command. Since the protocol does not
supports prepared statement deallocation, I wrote a wrapper over DEALLOCATE
command. The library knows about all prepared statements and
invalidates them automatically when user performs deallocate() wrapper.
But users can go with DEALLOCATE command directly and in these cases
I need to query the database to get the list of currently prepared statements
whenever CommandComplete message with DEALLOCATE command tag
is consumed. Moreover, I need to do it *synchronously* and this breaks
asynchronous API.
I propose to include name of the object in the CommandComplete (B)
message for the above commands.

That would be a change in the protocol, so it's not likely to happen
soon. There is a page where proposed changes to the wire protocol
are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Yours,
Laurenz Albe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Albe Laurenz (#2)
Re: Frontend/backend protocol improvements proposal (request).

2013/6/21 Albe Laurenz <laurenz.albe@wien.gv.at>

Dmitriy Igrishin wrote:

Sent: Thursday, June 20, 2013 5:09 PM
To: PostgreSQL Hackers
Subject: [HACKERS] Frontend/backend protocol improvements proposal

(request).

Hackers,

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message
for the following commands:
PREPARE;
DEALLOCATE;
DECLARE;
CLOSE;
LISTEN;
UNLISTEN;
SET;
RESET.
Namely, for example, users of my library can prepare statements by using
protocol directly or via PREPARE command. Since the protocol does not
supports prepared statement deallocation, I wrote a wrapper over

DEALLOCATE

command. The library knows about all prepared statements and
invalidates them automatically when user performs deallocate() wrapper.
But users can go with DEALLOCATE command directly and in these cases
I need to query the database to get the list of currently prepared

statements

whenever CommandComplete message with DEALLOCATE command tag
is consumed. Moreover, I need to do it *synchronously* and this breaks
asynchronous API.
I propose to include name of the object in the CommandComplete (B)
message for the above commands.

That would be a change in the protocol, so it's not likely to happen
soon. There is a page where proposed changes to the wire protocol
are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes

Well, even if this proposal moves to the TODO, it would be nice.

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Nothing wrong if the user wants to deal with scary and cumbersome code.
As library author, I want to help people make things simpler.
To understand me, please look at the pseudo C++ code below.

// A class designed to work with prepared statements
class Prepared_statement {
public:
// Methods to generate a Bind message, like
Prepared_statement* bind(Position, Value);
// ... and more
// Methods to send Execute message, like
void execute();
void execute_async();
};

class Connection {
public:
// many stuff ...
void close();

Prepared_statement* prepare(Name, Query);
void prepare_async(Statement);

// Make yet another instance of prepared statement.
Prepared_statement* prepared_statement(Name);

// etc.
};

The Connection class is a factory for Prepared_statement instances.
As you can see, the Connection::prepare() returns new instance of
*synchronously* prepared statement. Next, the user can bind values
and execute the statement, like this:

void f(Connection* cn)
{
// Prepare unnamed statement and execute it.
cn->prepare("SELECT $1::text")->bind(0, "Albe")->execute();
// Ps: don't worry about absence of delete; We are using smart pointers
:-)
}

But there is a another possible case:

void f(Connection* cn)
{
Prepared_statement* ps = cn->prepare("SELECT $1::text");
cn->close(); // THIS SHOULD invalidate all Prepared_statement instances
...
ps->bind(0, "Albe"); // ... to throw the exception here
}

Moreover, consider:

void f(Connection* cn)
{
Prepared_statement* ps1 = cn->prepare("ps1", "SELECT $1::text");
cn->deallocate("ps1"); // THIS invalidates ps1 object...
ps1->bind(0, "Albe"); // ... to throw the exception here

Prepared_statement* ps2 = cn->prepare("ps2", "SELECT $1::text");
cn->perform("DEALLOCATE ps2"); // THIS SHOULD ALSO invalidate ps2
object...
ps2->bind(0, "Albe"); // ... to throw the exception here
}

In the latter case when the user deallocates named prepared statement
directly,
the implementation of Connection can invalidates the prepared statement
(ps2) by
analyzing and parsing CommandComplete command tag to get it's name.

And please note, that the user can send DEALLOCATE asynchronously. And
there is
only two ways to get the prepared statement (or another session object's)
name:
1) Parse the SQL command which the user is attempts to send;
2) Just get it from CommandComplete command tag.

I beleive that the 1) is a 100% bad idea.

PS: this C++11 library is not publicaly available yet, but I hope it will
this year.

--
// Dmitriy.

#4Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Dmitriy Igrishin (#3)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

I'm moving this discussion to -general.

Dmitriy Igrishin wrote:

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message [...]
for the following commands:

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Nothing wrong if the user wants to deal with scary and cumbersome code.
As library author, I want to help people make things simpler.

I don't think that anything would change on the user end.

To understand me, please look at the pseudo C++ code below.

// A class designed to work with prepared statements
class Prepared_statement {

public:
// Methods to generate a Bind message, like
Prepared_statement* bind(Position, Value);
// ... and more
// Methods to send Execute message, like
void execute();
void execute_async();
};

class Connection {
public:
// many stuff ...
void close();

Prepared_statement* prepare(Name, Query);
void prepare_async(Statement);

// Make yet another instance of prepared statement.
Prepared_statement* prepared_statement(Name);

// etc.
};

The Connection class is a factory for Prepared_statement instances.
As you can see, the Connection::prepare() returns new instance of
*synchronously* prepared statement. Next, the user can bind values
and execute the statement, like this:

void f(Connection* cn)
{
// Prepare unnamed statement and execute it.
cn->prepare("SELECT $1::text")->bind(0, "Albe")->execute();
// Ps: don't worry about absence of delete; We are using smart pointers :-)
}

But there is a another possible case:

void f(Connection* cn)
{
Prepared_statement* ps = cn->prepare("SELECT $1::text");
cn->close(); // THIS SHOULD invalidate all Prepared_statement instances ...
ps->bind(0, "Albe"); // ... to throw the exception here
}

Attempting to send a bind message over a closed connection
will result in a PostgreSQL error. All you have to do is wrap
that into an exception of your liking.

Moreover, consider:

void f(Connection* cn)
{
Prepared_statement* ps1 = cn->prepare("ps1", "SELECT $1::text");

cn->deallocate("ps1"); // THIS invalidates ps1 object...

Shouldn't that be
cn->deallocate(ps1);
without quotes?

ps1->bind(0, "Albe"); // ... to throw the exception here

Prepared_statement* ps2 = cn->prepare("ps2", "SELECT $1::text");

cn->perform("DEALLOCATE ps2"); // THIS SHOULD ALSO invalidate ps2 object...
ps2->bind(0, "Albe"); // ... to throw the exception here

}

Again, sending a bind message for a deallocated prepared statement
will cause a PostgreSQL error automatically.

In the latter case when the user deallocates named prepared statement directly,
the implementation of Connection can invalidates the prepared statement (ps2) by
analyzing and parsing CommandComplete command tag to get it's name.

And please note, that the user can send DEALLOCATE asynchronously. And there is
only two ways to get the prepared statement (or another session object's) name:
1) Parse the SQL command which the user is attempts to send;
2) Just get it from CommandComplete command tag.

I beleive that the 1) is a 100% bad idea.

PS: this C++11 library is not publicaly available yet, but I hope it will this year.

I still think that it is a bad idea to track this on the client side.

What's wrong with throwing an exception when you get a PostgreSQL error?
If you want to distinguish between certain error conditions,
you can use the SQLSTATE. For example, trying to execute a deallocated
statement would cause SQLSTATE 26000.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Albe Laurenz (#4)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013/6/24 Albe Laurenz <laurenz.albe@wien.gv.at>

I'm moving this discussion to -general.

Okay, lets continue here.

Dmitriy Igrishin wrote:

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message [...]
for the following commands:

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Nothing wrong if the user wants to deal with scary and cumbersome code.
As library author, I want to help people make things simpler.

I don't think that anything would change on the user end.

But I think so.

To understand me, please look at the pseudo C++ code below.

// A class designed to work with prepared statements
class Prepared_statement {

public:
// Methods to generate a Bind message, like
Prepared_statement* bind(Position, Value);
// ... and more
// Methods to send Execute message, like
void execute();
void execute_async();
};

class Connection {
public:
// many stuff ...
void close();

Prepared_statement* prepare(Name, Query);
void prepare_async(Statement);

// Make yet another instance of prepared statement.
Prepared_statement* prepared_statement(Name);

// etc.
};

The Connection class is a factory for Prepared_statement instances.
As you can see, the Connection::prepare() returns new instance of
*synchronously* prepared statement. Next, the user can bind values
and execute the statement, like this:

void f(Connection* cn)
{
// Prepare unnamed statement and execute it.
cn->prepare("SELECT $1::text")->bind(0, "Albe")->execute();
// Ps: don't worry about absence of delete; We are using smart

pointers :-)

}

But there is a another possible case:

void f(Connection* cn)
{
Prepared_statement* ps = cn->prepare("SELECT $1::text");
cn->close(); // THIS SHOULD invalidate all Prepared_statement

instances ...

ps->bind(0, "Albe"); // ... to throw the exception here
}

Attempting to send a bind message over a closed connection
will result in a PostgreSQL error. All you have to do is wrap
that into an exception of your liking.

Okay, thanks for the info.

Moreover, consider:

void f(Connection* cn)
{
Prepared_statement* ps1 = cn->prepare("ps1", "SELECT $1::text");

cn->deallocate("ps1"); // THIS invalidates ps1 object...

Shouldn't that be
cn->deallocate(ps1);
without quotes?

No, because Connection::deallocate(const string&) considered by me as a
wrapper over
DEALLOCATE SQL command. (As any other SQL command wrapper declared as the
Connection class member.) But it can be overloaded though, but there are
Prepared_statement::deallocate(void) (without arguments) instead.

ps1->bind(0, "Albe"); // ... to throw the exception here

Prepared_statement* ps2 = cn->prepare("ps2", "SELECT $1::text");

cn->perform("DEALLOCATE ps2"); // THIS SHOULD ALSO invalidate ps2

object...

ps2->bind(0, "Albe"); // ... to throw the exception here

}

Again, sending a bind message for a deallocated prepared statement
will cause a PostgreSQL error automatically.

Thats great, but there is a some problem -- the *another* statement with
the same
name (and moreover with same parameters!) can be prepared after
deallocating.
And this can result in bugs. So, the client-side allocated "pointer to the
remote
statement" must be invalidated immediatly after deallocating.

In the latter case when the user deallocates named prepared statement

directly,

the implementation of Connection can invalidates the prepared statement

(ps2) by

analyzing and parsing CommandComplete command tag to get it's name.

And please note, that the user can send DEALLOCATE asynchronously. And

there is

only two ways to get the prepared statement (or another session

object's) name:

1) Parse the SQL command which the user is attempts to send;
2) Just get it from CommandComplete command tag.

I beleive that the 1) is a 100% bad idea.

PS: this C++11 library is not publicaly available yet, but I hope it

will this year.

I still think that it is a bad idea to track this on the client side.

What's wrong with throwing an exception when you get a PostgreSQL error?
If you want to distinguish between certain error conditions,
you can use the SQLSTATE. For example, trying to execute a deallocated
statement would cause SQLSTATE 26000.

See above why it make sense.

// Dmitriy.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Albe Laurenz (#4)
Re: Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

Albe Laurenz <laurenz.albe@wien.gv.at> writes:

Why do you need to track prepared statements on the client side?

The proposed change would fail to allow that anyway; consider the
possibility of a server-side function doing one or more PREPAREs or
DEALLOCATEs. The command tag would be completely inadequate for
reporting that.

Space is also a problem, since existing clients expect the tags to be
pretty short --- for instance, libpq has always had a hard-wired limit
of 64 bytes (CMDSTATUS_LEN) on what it can store for the tag. That's
not enough for a command name plus a full-length identifier.

If we were to try to do this, we'd need to invent some other reporting
mechanism, perhaps similar to ParameterStatus for GUC_REPORT variables.
But that would be a protocol break, which means it's unlikely to happen
anytime soon.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Tom Lane (#6)
Re: Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013/6/24 Tom Lane <tgl@sss.pgh.pa.us>

Albe Laurenz <laurenz.albe@wien.gv.at> writes:

Why do you need to track prepared statements on the client side?

The proposed change would fail to allow that anyway; consider the
possibility of a server-side function doing one or more PREPAREs or
DEALLOCATEs. The command tag would be completely inadequate for
reporting that.

Ah, indeed! I does not considered that. Thanks for the info.

Space is also a problem, since existing clients expect the tags to be
pretty short --- for instance, libpq has always had a hard-wired limit
of 64 bytes (CMDSTATUS_LEN) on what it can store for the tag. That's
not enough for a command name plus a full-length identifier.

:-(

If we were to try to do this, we'd need to invent some other reporting
mechanism, perhaps similar to ParameterStatus for GUC_REPORT variables.
But that would be a protocol break, which means it's unlikely to happen
anytime soon.

Is there are chance to add this idea in the TODO?

Btw, maybe we'd need also to identify each prepared statement (and portal)
not only
by the name, but by the automatically generated id included by the backend
in
ParseComplete and then pass this id with Bind messages to let the backend
throws
an error if the prepared statement (portal) is deallocated? (This would be
truly
automatically prepared statement backend tracking as menioned by Albe.)

--
// Dmitriy.

#8Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Dmitriy Igrishin (#5)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

Dmitriy Igrishin wrote:

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message [...]

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Thats great, but there is a some problem -- the *another* statement with the same
name (and moreover with same parameters!) can be prepared after deallocating.
And this can result in bugs. So, the client-side allocated "pointer to the remote
statement" must be invalidated immediatly after deallocating.

I understand the problem now.
I pondered a bit over your design, and I came up with a different
idea how to represent prepared statements in a C++ library.

First, a prepared statement is identified by its name.
To make the relationship between a PreparedStatement object
and the PostgreSQL prepared statement unique, I suggest that
the prepared statement name should not be exposed to the
library user. It should be a private property that is
set in the initializer in a unique fashion (for example, it
could be a string representation of the memory address
of the object).
That way, there can never be a name collision. That should take
care of the problem.

Of course somebody could find out what the statement name is and
manually issue a DEALLOCATE, but that would only cause future
use of the prepared statement to fail with an error, which I
think is ok.
Also, if somebody uses SQL PREPARE to create a prepared statement
whose name collides with one of the automatically chosen names,
they get what they deserve in my opinion.
It might be useful to warn users.

I also wouldn't provide a deallocate() method. A deallocated
prepared statement is useless. I think that it would be more
logical to put that into the destructor method.
If somebody wants to get rid of the prepared statement
ahead of time, they can destroy the object.

Does that make sense?

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Albe Laurenz (#8)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013/6/25 Albe Laurenz <laurenz.albe@wien.gv.at>

Dmitriy Igrishin wrote:

While developing a C++ client library for Postgres I felt lack of

extra

information in command tags in the CommandComplete (B) message [...]

It seems like bad design to me to keep a list of prepared statements
on the client side when it is already kept on the server side
(accessible with the pg_prepared_statements view).

What's wrong with the following:
If the user wants to deallocate an individual prepared statement,
just send "DEALLOCATE <statement name>" to the server. If the
statement does not exist, the server will return an error.
If the user wants to deallocate all statements, just send
"DEALLOCATE ALL".
Why do you need to track prepared statements on the client side?

Thats great, but there is a some problem -- the *another* statement with

the same

name (and moreover with same parameters!) can be prepared after

deallocating.

And this can result in bugs. So, the client-side allocated "pointer to

the remote

statement" must be invalidated immediatly after deallocating.

I understand the problem now.
I pondered a bit over your design, and I came up with a different
idea how to represent prepared statements in a C++ library.

Thanks for thinking about it, Albe!

First, a prepared statement is identified by its name.
To make the relationship between a PreparedStatement object
and the PostgreSQL prepared statement unique, I suggest that
the prepared statement name should not be exposed to the
library user. It should be a private property that is
set in the initializer in a unique fashion (for example, it
could be a string representation of the memory address
of the object).
That way, there can never be a name collision. That should take
care of the problem.

In fact something like was implemented in very early versions of my
library. There are some reasons why I've redesigned the library:

1) If the user does not specify the name of the prepared statement (or
specify it as "") it is considered as unnamed prepared statement -- a one of
the important concepts of the frontend/backend protocol, which is a base of
my current design.
The unnamed prepared statements are very useful since they are deallocated
authomatically when the backend receives the next Parse message with
empty name.

2) Meaningful names of the named prepared statements (as any other database
objects) may be useful while debugging the application. Imagine the memory
addresses (or any other surrogate names) in the Postgres logs...

Hence, the name() method should be public and name().empty() means
unnamed prepared statement.

Of course somebody could find out what the statement name is and
manually issue a DEALLOCATE, but that would only cause future
use of the prepared statement to fail with an error, which I
think is ok.
Also, if somebody uses SQL PREPARE to create a prepared statement
whose name collides with one of the automatically chosen names,
they get what they deserve in my opinion.
It might be useful to warn users.

I also wouldn't provide a deallocate() method. A deallocated
prepared statement is useless. I think that it would be more
logical to put that into the destructor method.
If somebody wants to get rid of the prepared statement
ahead of time, they can destroy the object.

I've also considered this approach and there are some reasons why I don't
implemented the prepared statement class this way:

1) There are Describe message in the protocol. Thus, any prepared statement
can be also described this way:
Prepared_statement* pst1 = connection->describe("name");
Prepared_statement* pst2 = connection->describe("name"); // pst2 points
to the same remote object
Think about the pst as a pointer to the remote object (prepared statement).
Since each statement can be described multiple times, the deleting one of
them
should not result in deallocating the prepared statement by the backend.

2) The best way to inform the user about errors in the modern C++ are
exceptions.
The dellocate operation (as any other query to the database) can be result
in
throwing some exception. But descructors should not throw. (If you are
familiar with
C++ well you should know about the gotchas when destructors throw.)
So, there are deallocate() method which seems to me ok.

Btw, by the reason 2) there are no any transaction RAII classes as in some
other libraries,
because the ROLLBACK command should be executed in the destructor and may
throw.

Does that make sense?

Thanks again for suggestions, Albe!

--
// Dmitriy.

#10Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Dmitriy Igrishin (#9)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

Dmitriy Igrishin wrote:

I understand the problem now.
I pondered a bit over your design, and I came up with a different
idea how to represent prepared statements in a C++ library.

First, a prepared statement is identified by its name.
To make the relationship between a PreparedStatement object
and the PostgreSQL prepared statement unique, I suggest that
the prepared statement name should not be exposed to the
library user. It should be a private property that is
set in the initializer in a unique fashion (for example, it
could be a string representation of the memory address
of the object).
That way, there can never be a name collision. That should take
care of the problem.

In fact something like was implemented in very early versions of my
library. There are some reasons why I've redesigned the library:

1) If the user does not specify the name of the prepared statement (or
specify it as "") it is considered as unnamed prepared statement -- a one of
the important concepts of the frontend/backend protocol, which is a base of
my current design.
The unnamed prepared statements are very useful since they are deallocated
authomatically when the backend receives the next Parse message with
empty name.

If you want unnamed prepared statements in your library, I would
use a different class for them since they behave quite differently.
That would also make this concern go away.

Since there can be only one unnamed prepared statement per
session, there should be only one such object per connection.
It should not get deallocated; maybe it could be private to the
connection, which only offers a "parseUnnamed" and "executeUnnamed"
mathod.

2) Meaningful names of the named prepared statements (as any other database
objects) may be useful while debugging the application. Imagine the memory
addresses (or any other surrogate names) in the Postgres logs...

That wouldn't worry me, but that's a matter of taste.

Hence, the name() method should be public and name().empty() means
unnamed prepared statement.

You could offer a getter for the name if anybody needs it for debugging.

If you really want your users to be able to set prepared statement
names, you'd have to warn them to be careful to avoid the
problem of name collision -- you'd handle the burden to them.
That's of course also a possible way, but I thought you wanted
to avoid that.

I also wouldn't provide a deallocate() method. A deallocated
prepared statement is useless. I think that it would be more
logical to put that into the destructor method.
If somebody wants to get rid of the prepared statement
ahead of time, they can destroy the object.

I've also considered this approach and there are some reasons why I don't
implemented the prepared statement class this way:

1) There are Describe message in the protocol. Thus, any prepared statement
can be also described this way:
Prepared_statement* pst1 = connection->describe("name");
Prepared_statement* pst2 = connection->describe("name"); // pst2 points to the same remote object
Think about the pst as a pointer to the remote object (prepared statement).
Since each statement can be described multiple times, the deleting one of them
should not result in deallocating the prepared statement by the backend.

That seems like bad design to me.
I wouldn't allow different objects pointing to the same prepared
statement. What is the benefit?
Shouldn't the model represent reality?

2) The best way to inform the user about errors in the modern C++ are exceptions.
The dellocate operation (as any other query to the database) can be result in
throwing some exception. But descructors should not throw. (If you are familiar with
C++ well you should know about the gotchas when destructors throw.)
So, there are deallocate() method which seems to me ok.

Of course an error during DEALLOCATE should be ignored in that case.
It's hard to conceive of a case where deallocation fails, but the
connection is fine. And if the connection is closed, the statement
will be deallocated anyway.

Btw, by the reason 2) there are no any transaction RAII classes as in some other libraries,
because the ROLLBACK command should be executed in the destructor and may throw.

I tend to believe that such errors could also be ignored.
If ROLLBACK (or anything else) throws an error, the transaction will
get rolled back anyway.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Albe Laurenz (#10)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013/6/26 Albe Laurenz <laurenz.albe@wien.gv.at>

Dmitriy Igrishin wrote:

I understand the problem now.
I pondered a bit over your design, and I came up with a different
idea how to represent prepared statements in a C++ library.

First, a prepared statement is identified by its name.
To make the relationship between a PreparedStatement object
and the PostgreSQL prepared statement unique, I suggest that
the prepared statement name should not be exposed to the
library user. It should be a private property that is
set in the initializer in a unique fashion (for example, it
could be a string representation of the memory address
of the object).
That way, there can never be a name collision. That should take
care of the problem.

In fact something like was implemented in very early versions of my
library. There are some reasons why I've redesigned the library:

1) If the user does not specify the name of the prepared statement (or
specify it as "") it is considered as unnamed prepared statement -- a

one of

the important concepts of the frontend/backend protocol, which is a base

of

my current design.
The unnamed prepared statements are very useful since they are

deallocated

authomatically when the backend receives the next Parse message with
empty name.

If you want unnamed prepared statements in your library, I would
use a different class for them since they behave quite differently.
That would also make this concern go away.

I've considered this approach also on the designing stage, but I dislike
it, because
Named_prepared_statement and Unnamed_prepared_statement, derived from
Prepared_statement, will have exactly the same invariant. The name is a
common
property of all prepared statements.

Since there can be only one unnamed prepared statement per
session, there should be only one such object per connection.
It should not get deallocated; maybe it could be private to the
connection, which only offers a "parseUnnamed" and "executeUnnamed"
mathod.

More precisely, there can be only one uniquely named prepared statement
(named
or unnamed) per session.
Could you provide a signature of parseUnnamed and executeUnnamed please?
I don't clearly understand this approach.

2) Meaningful names of the named prepared statements (as any other

database

objects) may be useful while debugging the application. Imagine the

memory

addresses (or any other surrogate names) in the Postgres logs...

That wouldn't worry me, but that's a matter of taste.

Hence, the name() method should be public and name().empty() means
unnamed prepared statement.

You could offer a getter for the name if anybody needs it for debugging.

If you really want your users to be able to set prepared statement
names, you'd have to warn them to be careful to avoid the
problem of name collision -- you'd handle the burden to them.
That's of course also a possible way, but I thought you wanted
to avoid that.

The mentioned burden is already handled by backend which throws
duplicate_prepared_statement (42P05) error.

I also wouldn't provide a deallocate() method. A deallocated
prepared statement is useless. I think that it would be more
logical to put that into the destructor method.
If somebody wants to get rid of the prepared statement
ahead of time, they can destroy the object.

I've also considered this approach and there are some reasons why I don't
implemented the prepared statement class this way:

1) There are Describe message in the protocol. Thus, any prepared

statement

can be also described this way:
Prepared_statement* pst1 = connection->describe("name");
Prepared_statement* pst2 = connection->describe("name"); // pst2

points to the same remote object

Think about the pst as a pointer to the remote object (prepared

statement).

Since each statement can be described multiple times, the deleting one

of them

should not result in deallocating the prepared statement by the backend.

That seems like bad design to me.
I wouldn't allow different objects pointing to the same prepared
statement. What is the benefit?
Shouldn't the model represent reality?

Well, then the C and C++ languages are bad designed too, because they
allow to have as many pointers to the same as the user like (needs) :-)
Really, I don't see bad design here. Describing prepared statement
multiple times will results in allocating several independent descriptors.
(As with, for example, performing two SELECTs will result in allocating
several independent results by libpq.)
As a bonus, the user can bind different data to independent prepared
statements
objects stepwise (which may be important in some cases) before executing.

2) The best way to inform the user about errors in the modern C++ are

exceptions.

The dellocate operation (as any other query to the database) can be

result in

throwing some exception. But descructors should not throw. (If you are

familiar with

C++ well you should know about the gotchas when destructors throw.)
So, there are deallocate() method which seems to me ok.

Of course an error during DEALLOCATE should be ignored in that case.
It's hard to conceive of a case where deallocation fails, but the
connection is fine. And if the connection is closed, the statement
will be deallocated anyway.

Why this error should be ignored? I believe that this should be decided by
the user.
As a library author I don't know (and cannot know) how to react on such
errors
in the end applications.

Btw, by the reason 2) there are no any transaction RAII classes as in

some other libraries,

because the ROLLBACK command should be executed in the destructor and

may throw.

I tend to believe that such errors could also be ignored.
If ROLLBACK (or anything else) throws an error, the transaction will
get rolled back anyway.

Perhaps, but, again, I don't know how the user will prefer to react. So, I
prefer just
to throw and allow the user to decide.

--
// Dmitriy.

#12Albe Laurenz
laurenz.albe@wien.gv.at
In reply to: Dmitriy Igrishin (#11)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

Dmitriy Igrishin wrote:

Since there can be only one unnamed prepared statement per
session, there should be only one such object per connection.
It should not get deallocated; maybe it could be private to the
connection, which only offers a "parseUnnamed" and "executeUnnamed"
mathod.

More precisely, there can be only one uniquely named prepared statement (named
or unnamed) per session.
Could you provide a signature of parseUnnamed and executeUnnamed please?
I don't clearly understand this approach.

I'm just brainstorming.
I'm thinking of something like
void Connection::prepareUnnamed(const char *query,
int nParams,
const Oid *paramTypes);
and
Result Connection::executeUnnamed(int nParams,
const char * const *paramValues,
const int *paramLengths,
const int *paramFormats,
int resultFormat);

But I'm not saying that this is the perfect solution.

If you really want your users to be able to set prepared statement
names, you'd have to warn them to be careful to avoid the
problem of name collision -- you'd handle the burden to them.
That's of course also a possible way, but I thought you wanted
to avoid that.

The mentioned burden is already handled by backend which throws
duplicate_prepared_statement (42P05) error.

I mean the problem that you create a prepared statement,
then issue "DEALLOCATE stmt_name" create a new prepared statement
with the same name and then use the first prepared statement.

Prepared_statement* pst1 = connection->describe("name");
Prepared_statement* pst2 = connection->describe("name"); // pst2 points to the same remote
object

That seems like bad design to me.
I wouldn't allow different objects pointing to the same prepared
statement. What is the benefit?
Shouldn't the model represent reality?

Well, then the C and C++ languages are bad designed too, because they
allow to have as many pointers to the same as the user like (needs) :-)

That's a different thing, because all these pointers contain the same
value. So if pst1 and pst2 represent the same object, I'd like
pst1 == pst2 to be true.

Really, I don't see bad design here. Describing prepared statement
multiple times will results in allocating several independent descriptors.

... but for the same prepared statement.

(As with, for example, performing two SELECTs will result in allocating
several independent results by libpq.)

But those would be two different statement to PostgreSQL, even if the
query strings are identical.

Mind you, I'm not saying that I am the person that decides what is
good taste and what not, I'm just sharing my sentiments.

Of course an error during DEALLOCATE should be ignored in that case.
It's hard to conceive of a case where deallocation fails, but the
connection is fine. And if the connection is closed, the statement
will be deallocated anyway.

Why this error should be ignored? I believe that this should be decided by the user.
As a library author I don't know (and cannot know) how to react on such errors
in the end applications.

Again, I would say that that is a matter of taste.
I just cannot think of a case where this would be important.

Btw, by the reason 2) there are no any transaction RAII classes as in some other libraries,
because the ROLLBACK command should be executed in the destructor and may throw.

I tend to believe that such errors could also be ignored.
If ROLLBACK (or anything else) throws an error, the transaction will
get rolled back anyway.

Perhaps, but, again, I don't know how the user will prefer to react. So, I prefer just
to throw and allow the user to decide.

Agreed, it's a matter of taste.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Albe Laurenz (#12)
Re: [HACKERS] Frontend/backend protocol improvements proposal (request).

2013/6/28 Albe Laurenz <laurenz.albe@wien.gv.at>

Dmitriy Igrishin wrote:

Since there can be only one unnamed prepared statement per
session, there should be only one such object per connection.
It should not get deallocated; maybe it could be private to the
connection, which only offers a "parseUnnamed" and "executeUnnamed"
mathod.

More precisely, there can be only one uniquely named prepared statement

(named

or unnamed) per session.
Could you provide a signature of parseUnnamed and executeUnnamed please?
I don't clearly understand this approach.

I'm just brainstorming.
I'm thinking of something like
void Connection::prepareUnnamed(const char *query,
int nParams,
const Oid *paramTypes);
and
Result Connection::executeUnnamed(int nParams,
const char * const *paramValues,
const int *paramLengths,
const int *paramFormats,
int resultFormat);

But I'm not saying that this is the perfect solution.

I see. It's more like C-styled design, but not C++.

If you really want your users to be able to set prepared statement
names, you'd have to warn them to be careful to avoid the
problem of name collision -- you'd handle the burden to them.
That's of course also a possible way, but I thought you wanted
to avoid that.

The mentioned burden is already handled by backend which throws
duplicate_prepared_statement (42P05) error.

I mean the problem that you create a prepared statement,
then issue "DEALLOCATE stmt_name" create a new prepared statement
with the same name and then use the first prepared statement.

I see. But I see no problem to invalidate localy allocated descriptors
of the remote prepared statements

Prepared_statement* pst1 = connection->describe("name");
Prepared_statement* pst2 = connection->describe("name"); // pst2

points to the same remote

object

That seems like bad design to me.
I wouldn't allow different objects pointing to the same prepared
statement. What is the benefit?
Shouldn't the model represent reality?

Well, then the C and C++ languages are bad designed too, because they
allow to have as many pointers to the same as the user like (needs) :-)

That's a different thing, because all these pointers contain the same
value. So if pst1 and pst2 represent the same object, I'd like
pst1 == pst2 to be true.

IOW, you want to map localy allocated object to the remote one.
Or course, it is possible. But I dislike approach, since:
a) objects pointed by pst* should be owned by the Connection instance;
b) these objects may have its own state depending on the application
context (e.g., different valuess binded, result format settings etc
etc.)
but it will be impossible due to a).
BTW, according to 18.6.3.1 of "The C++ Programming Language" by Stroustrup,
"The default meaning of < and == for pointers are rarely useful as
comparison criteria
for the objects pointed to." So, I believe that you means that you'd like
*pst1 == *pst2. And yes, I'll agree with it.

Really, I don't see bad design here. Describing prepared statement
multiple times will results in allocating several independent

descriptors.

... but for the same prepared statement.

(As with, for example, performing two SELECTs will result in allocating
several independent results by libpq.)

But those would be two different statement to PostgreSQL, even if the
query strings are identical.

Mind you, I'm not saying that I am the person that decides what is
good taste and what not, I'm just sharing my sentiments.

Don't worry, I'm mature enough to understand this :-) And thank you
for you suggestions!

Of course an error during DEALLOCATE should be ignored in that case.
It's hard to conceive of a case where deallocation fails, but the
connection is fine. And if the connection is closed, the statement
will be deallocated anyway.

Why this error should be ignored? I believe that this should be decided

by the user.

As a library author I don't know (and cannot know) how to react on such

errors

in the end applications.

Again, I would say that that is a matter of taste.
I just cannot think of a case where this would be important.

Maybe, but I'm firmly belive, that errors, provoked by the user, should
not be ignored at all. This is a one of design principles of my library.

Btw, by the reason 2) there are no any transaction RAII classes as in

some other libraries,

because the ROLLBACK command should be executed in the destructor and

may throw.

I tend to believe that such errors could also be ignored.
If ROLLBACK (or anything else) throws an error, the transaction

will

get rolled back anyway.

Perhaps, but, again, I don't know how the user will prefer to react. So,

I prefer just

to throw and allow the user to decide.

Agreed, it's a matter of taste.

Yours,
Laurenz Albe

--
// Dmitriy.

#14Noah Misch
noah@leadboat.com
In reply to: Dmitriy Igrishin (#3)
Re: Frontend/backend protocol improvements proposal (request).

On Fri, Jun 21, 2013 at 12:37:32PM +0400, Dmitriy Igrishin wrote:

2013/6/21 Albe Laurenz <laurenz.albe@wien.gv.at>

Dmitriy Igrishin wrote:

While developing a C++ client library for Postgres I felt lack of extra
information in command tags in the CommandComplete (B) message
for the following commands:
PREPARE;
DEALLOCATE;
DECLARE;
CLOSE;
LISTEN;
UNLISTEN;
SET;
RESET.
Namely, for example, users of my library can prepare statements by using
protocol directly or via PREPARE command. Since the protocol does not
supports prepared statement deallocation, I wrote a wrapper over

DEALLOCATE

command. The library knows about all prepared statements and
invalidates them automatically when user performs deallocate() wrapper.
But users can go with DEALLOCATE command directly and in these cases
I need to query the database to get the list of currently prepared

statements

whenever CommandComplete message with DEALLOCATE command tag
is consumed. Moreover, I need to do it *synchronously* and this breaks
asynchronous API.
I propose to include name of the object in the CommandComplete (B)
message for the above commands.

That would be a change in the protocol, so it's not likely to happen
soon. There is a page where proposed changes to the wire protocol
are collected: http://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes

Well, even if this proposal moves to the TODO, it would be nice.

That's worth at least considering when we start to revise the protocol, so I
have added it to the TODO list.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers