retrieving varchar size
Hello,
I was wondering if it would be possible, in the Postgres backend, to
send back the defined column size for the varchar data type (and
possibly the char() type, i.e., bpchar) on a query? Currently, it just
sends back -1 for the size, which makes it difficult in the frontend
(i.e., odbc driver) to determine what the size of the column is.
Thank you,
Byron
Byron Nikolaidis wrote:
Hello,
I was wondering if it would be possible, in the Postgres backend, to
send back the defined column size for the varchar data type (and
possibly the char() type, i.e., bpchar) on a query? Currently, it just
sends back -1 for the size, which makes it difficult in the frontend
(i.e., odbc driver) to determine what the size of the column is.
While the right solution to this is of course getting the size from
backend, there exists a workaround now (assuming that the query is not
too expensive). While ASCII cursors always hide the varchar sizes,
binary ones return the size in actual data (by zero-padding the
returned data to max size), so one can determine the actual max
sizes by opening the query in binary cursor and then examining
enough records to get one non-null field for each varchar field.
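A minimal sketch of the probing step described above, assuming the client has already collected the per-row wire lengths of one column from a binary cursor (the helper name and data layout here are hypothetical, not libpq API): since every non-null char() value comes back zero-padded to the declared width, the wire length of any non-null row is the declared size.

```c
#include <stddef.h>

/* Hypothetical helper: wire_lengths[i] is the byte length of row i's
 * value for one column as returned by a binary cursor, or -1 for NULL.
 * Because non-null values are padded to the declared width, the first
 * non-null length *is* the declared width.  Returns -1 if every row
 * examined was NULL (caller should fetch more records). */
int declared_width(const int *wire_lengths, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++)
        if (wire_lengths[i] >= 0)
            return wire_lengths[i];
    return -1;  /* no non-null row seen yet */
}
```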
Hannu
Hello,
I was wondering if it would be possible, in the Postgres backend, to
send back the defined column size for the varchar data type (and
possibly the char() type, i.e., bpchar) on a query? Currently, it just
sends back -1 for the size, which makes it difficult in the frontend
(i.e., odbc driver) to determine what the size of the column is.
This is kind of tough to do.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Byron Nikolaidis wrote:
Hello,
I was wondering if it would be possible, in the Postgres backend, to
send back the defined column size for the varchar data type (and
possibly the char() type, i.e., bpchar) on a query? Currently, it just
sends back -1 for the size, which makes it difficult in the frontend
(i.e., odbc driver) to determine what the size of the column is.
While the right solution to this is of course getting the size from
backend, there exists a workaround now (assuming that the query is not
too expensive). While ASCII cursors always hide the varchar sizes,
binary ones return the size in actual data (by zero-padding the
returned data to max size), so one can determine the actual max
sizes by opening the query in binary cursor and then examining
enough records to get one non-null field for each varchar field.
As of 6.3, this is only true of char() fields. Varchar() is now
variable length.
Bruce Momjian wrote:
Byron Nikolaidis wrote:
Hello,
I was wondering if it would be possible, in the Postgres backend, to
send back the defined column size for the varchar data type (and
possibly the char() type, i.e., bpchar) on a query? Currently, it just
sends back -1 for the size, which makes it difficult in the frontend
(i.e., odbc driver) to determine what the size of the column is.
This is kind of tough to do.
What makes it tough?
Is this info not available where needed, or is changing the protocol
tough?
In the latter case, I would suggest an additional SQL command for open
cursors,
or a pseudo table for open cursor where you could do a simple select
statement:
DECLARE CURSOR FOO_CURSOR FOR SELECT * FROM MYTABLE;
SELECT _FIELD_NAME,_FIELD_TYPE,_FIELD_SIZE FROM
FOO_CURSOR_INFO_PSEUDOTABLE;
While the right solution to this is of course getting the size from
backend, there exists a workaround now (assuming that the query is not
too expensive). While ASCII cursors always hide the varchar sizes,
binary ones return the size in actual data (by zero-padding the
returned data to max size), so one can determine the actual max
sizes by opening the query in binary cursor and then examining
enough records to get one non-null field for each varchar field.
As of 6.3, this is only true of char() fields. Varchar() is now
variable length.
As knowing field size is quite essential for Borland applications, some
solution should be found for this.
Hannu
In the latter case, I would suggest an additional SQL command for open
cursors,
or a pseudo table for open cursor where you could do a simple select
statement:
DECLARE CURSOR FOO_CURSOR FOR SELECT * FROM MYTABLE;
SELECT _FIELD_NAME,_FIELD_TYPE,_FIELD_SIZE FROM
FOO_CURSOR_INFO_PSEUDOTABLE;
The information you want is in pg_attribute.atttypmod. It is normally
-1, but is set for char() and varchar() fields, and includes the 4-byte
length. See bin/psql/psql.c for a sample of its use.
Bruce Momjian wrote:
In the latter case, I would suggest an additional SQL command for open
cursors,
or a pseudo table for open cursor where you could do a simple select
statement:
DECLARE CURSOR FOO_CURSOR FOR SELECT * FROM MYTABLE;
SELECT _FIELD_NAME,_FIELD_TYPE,_FIELD_SIZE FROM
FOO_CURSOR_INFO_PSEUDOTABLE;
The information you want is in pg_attribute.atttypmod. It is normally
-1, but is set for char() and varchar() fields, and includes the 4-byte
length. See bin/psql/psql.c for a sample of its use.
I see everyone writing in terms of length. You do mean precision, don't
you? For our purposes, this precision should arrive in the result
header. (Redundancy in each tuple could be overlooked.) The goal is to be
able to put realistic bounds on memory allocation before the entire result is
read in. For this to work, functions must also be able to propagate
their precision.
Did I spell doom to this idea?
David Hartwig wrote:
Bruce Momjian wrote:
In the latter case, I would suggest an additional SQL command for open
cursors,
or a pseudo table for open cursor where you could do a simple select
statement:
DECLARE CURSOR FOO_CURSOR FOR SELECT * FROM MYTABLE;
SELECT _FIELD_NAME,_FIELD_TYPE,_FIELD_SIZE FROM
FOO_CURSOR_INFO_PSEUDOTABLE;
The information you want is in pg_attribute.atttypmod. It is normally
-1, but is set for char() and varchar() fields, and includes the 4-byte
length. See bin/psql/psql.c for a sample of its use.
is this on client side or server side?
Last time I checked (it was in 6.2 protocol) it was not sent to client.
What I need is the defined max length of varchar (or char), not just
the actual length of each field of that type. This is used by Borland's BDE,
and if this changes depending on the where clause, it breaks BDE.
I see everyone writing in terms of length. You do mean precision, don't
you?
in case varchars have precision, yes ;)
For our purposes, this precision should arrive in the result
header. (Redundancy in each tuple could be overlooked.) The goal is to be
able to put realistic bounds on memory allocation before the entire result is
read in. For this to work, functions must also be able to propagate
their precision.
Yes, the functions should behave as objects, so that you can get
metadata on them.
So functions should know, depending on the max lengths of their arguments,
how long the strings they return can be.
But even without this functionality, having this info is essential to
getting Borland stuff to work.
Did I spell doom to this idea?
I hope not.
Hannu
The information you want is in pg_attribute.atttypmod. It is normally
-1, but is set for char() and varchar() fields, and includes the 4-byte
length. See bin/psql/psql.c for a sample of its use.
I see everyone writing in terms of length. You do mean precision, don't
you? For our purposes, this precision should arrive in the result
header. (Redundancy in each tuple could be overlooked.) The goal is to be
able to put realistic bounds on memory allocation before the entire result is
read in. For this to work, functions must also be able to propagate
their precision.
Did I spell doom to this idea?
Hmm. The problem is that many of us use the old 'text' type, which
doesn't have a defined length. Not sure how to handle this in a
portable libpq way?
is this on client side or server side?
Last time I checked (it was in 6.2 protocol) it was not sent to client.
What I need is the defined max length of varchar (or char), not just
the actual length of each field of that type. This is used by Borland's BDE,
and if this changes depending on the where clause, it breaks BDE.
Can't you do:
select atttypmod from pg_attribute
where attrelid = 10003 and attname = 'col1';
That will give the length + 4 bytes.
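To recover the declared width on the client side, subtract the 4-byte varlena length word that atttypmod includes (the backend sources call it VARHDRSZ). A sketch, with the function name my own:

```c
/* atttypmod for char(n)/varchar(n) is the declared width plus the
 * 4-byte varlena length word (VARHDRSZ in the backend sources); -1
 * means no length was declared (e.g. the text type). */
#define VARHDRSZ 4

int declared_precision(int atttypmod)
{
    if (atttypmod < 0)
        return -1;              /* unconstrained, e.g. text */
    return atttypmod - VARHDRSZ;
}
```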
Bruce Momjian wrote:
Can't you do:
select atttypmod from pg_attribute
where attrelid = 10003 and attname = 'col1';
That will give the length + 4 bytes.
The problem with that theory is this. If the frontend application just
executes some random query, such as "select * from table", you really do not
know anything about what is coming back. You must rely on the little bit of
information the protocol gives you. In the case of Postgres, it gives you
the fieldname, datatype, and size for each column in the result.
Unfortunately, for varchar and char(n), the size reports -1. This is not
very helpful for describing the result set.
Your above example works fine (in fact we use that already) when you know the
table and column name, as in metadata functions such as SQLColumns() in the
ODBC driver.
Byron
The problem with that theory is this. If the frontend application just
executes some random query, such as "select * from table", you really do not
know anything about what is coming back. You must rely on the little bit of
information the protocol gives you. In the case of Postgres, it gives you
the fieldname, datatype, and size for each column in the result.
Unfortunately, for varchar and char(n), the size reports -1. This is not
very helpful for describing the result set.
Your above example works fine (in fact we use that already) when you know the
table and column name, as in metadata functions such as SQLColumns() in the
ODBC driver.
Yep. We could pass back atttypmod as part of the PGresult. I can add
that to the TODO list. Would that help?
Bruce Momjian wrote:
The problem with that theory is this. If the frontend application just
executes some random query, such as "select * from table", you really do not
know anything about what is coming back. You must rely on the little bit of
information the protocol gives you. In the case of Postgres, it gives you
the fieldname, datatype, and size for each column in the result.
Unfortunately, for varchar and char(n), the size reports -1. This is not
very helpful for describing the result set.
Your above example works fine (in fact we use that already) when you know the
table and column name, as in metadata functions such as SQLColumns() in the
ODBC driver.
Yep. We could pass back atttypmod as part of the PGresult. I can add
that to the TODO list. Would that help?
Yes, that would do it!
Thank you for listening to our ravings on this issue.
Byron
Bruce Momjian wrote:
The problem with that theory is this. If the frontend application just
executes some random query, such as "select * from table", you really do not
know anything about what is coming back. You must rely on the little bit of
information the protocol gives you. In the case of Postgres, it gives you
the fieldname, datatype, and size for each column in the result.
Unfortunately, for varchar and char(n), the size reports -1. This is not
very helpful for describing the result set.
Your above example works fine (in fact we use that already) when you know the
table and column name, as in metadata functions such as SQLColumns() in the
ODBC driver.
Yep. We could pass back atttypmod as part of the PGresult. I can add
that to the TODO list. Would that help?
Yes, that would do it!
Thank you for listening to our ravings on this issue.
Added to TODO:
* Add pg_attribute.atttypmod/Resdom->restypmod to PGresult structure
This is a good suggestion.
Bruce Momjian <maillist@candle.pha.pa.us> writes:
Added to TODO:
* Add pg_attribute.atttypmod/Resdom->restypmod to PGresult structure
This is a good suggestion.
This will require a frontend/backend protocol change, no?
If so, right now would be a great time to address it; I'm about halfway
through rewriting libpq for the asynchronous-query support we discussed
last week, and would be happy to make the client-side mods while I still
have the code in my head.
As long as we are opening up the protocol, there is an incredibly grotty
hack in libpq that I'd like to get rid of. It's hard for me to be
sure whether it's even necessary, but: when libpq gets a 'C' response
(which the documentation says is a "completed response") it assumes that
this is *not* the end of the transaction, and that the only way to be
sure that everything's been read is to send an empty query and wait for
the empty query's 'I' response to be returned.
case 'C': /* portal query command, no rows returned */
/*
* since backend may produce more than one result
* for some commands need to poll until clear.
* Send an empty query down, and keep reading out of
* the pipe until an 'I' is received.
*/
Does this ring a bell with anyone? I'm prepared to believe that it's
useless code, but have no easy way to be sure.
Needless to say, if there really is an ambiguity then the *right* answer
is to fix the protocol so that the end of a query/response cycle is
unambiguously determinable. It looks to me like this hack is costing us
an extra round trip to the server for every ordinary query. That sucks.
regards, tom lane
Yes, it rings a bell alright,
When you execute a multiple query (denoted by semicolons) like "set geqo to
'off'; show datestyle; select * from table", you get those multiple returns and
MUST read until you get the 'I'. If you don't, you're screwed the next time you
try and read anything because all that stuff is still in the pipe.
Question though, I didn't think my request would have caused a major protocol
change. I thought that the '-1' would simply be replaced by the correct size?
Byron
Byron Nikolaidis <byronn@insightdist.com> writes:
Yes, it rings a bell alright. When you execute a multiple query
(denoted by semicolons) like "set geqo to 'off'; show datestyle;
select * from table", you get those multiple returns and MUST read
until you get the 'I'. If you don't, you're screwed the next time you
try and read anything because all that stuff is still in the pipe.
That seems pretty bogus. What happens if you do
select * from table1; select * from table2
? The way the code in libpq looks, I think the response from the
first select would get lost entirely (probably even cause a memory
leak). It's not set up to handle receipt of more than one command
response in any clean fashion. We'd need to revise the application
API to make that work right.
Playing around with psql, it seems that you can't actually get psql
to submit a multi-command line as a single query; it seems to break
it up into separate queries. Which is what libpq can cope with.
I think we should either forbid multiple commands per PQexec call,
or fix libpq to handle them properly (and hence be able to return
a series of PGresults, not just one).
Question though, I didnt think my request would have caused a major
protocol change. I though that the '-1' would simply be replaced by
the correct size?
I assumed we'd want to add the restypmod as a new field in PGresult
and in the protocol. But I'm just a newbie.
regards, tom lane
Yes, it rings a bell alright.
When you execute a multiple query (denoted by semicolons) like "set geqo to
'off'; show datestyle; select * from table", you get those multiple returns and
MUST read until you get the 'I'. If you don't, you're screwed the next time you
try and read anything because all that stuff is still in the pipe.
Good point. If we don't send the empty query, the queued up results get
out of sync with the requests.
One solution is to handle it the way psql does. It keeps track of the
quotes, backslashes, and semicolons in the input string, and sends just
one query each time to the backend, and prints the results.
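The scan psql does can be sketched roughly as follows (simplified: real psql also tracks double quotes and comments, and the function name here is mine, not psql's):

```c
#include <stddef.h>

/* Count the statements in a query string, treating semicolons that are
 * inside single quotes, or escaped with a backslash, as ordinary
 * characters.  Whitespace-only segments between semicolons are not
 * counted as statements. */
size_t count_statements(const char *q)
{
    size_t n = 0;
    int in_quote = 0;   /* inside a '...' literal? */
    int saw_text = 0;   /* current segment has non-blank content? */

    for (const char *p = q; *p; p++) {
        if (*p == '\\' && p[1] != '\0') {
            p++;                        /* skip the escaped character */
            saw_text = 1;
        } else if (*p == '\'') {
            in_quote = !in_quote;
            saw_text = 1;
        } else if (*p == ';' && !in_quote) {
            if (saw_text)
                n++;                    /* statement terminator */
            saw_text = 0;
        } else if (*p != ' ' && *p != '\t' && *p != '\n') {
            saw_text = 1;
        }
    }
    if (saw_text)
        n++;            /* trailing statement without a semicolon */
    return n;
}
```

With a count like this, libpq could read exactly that many responses and hand back only the last one.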
Now, with libpq, I think the proper solution would be to scan the input
string, count the number of queries being sent, send the whole
string (with the multiple queries), and retrieve that many answers from
the backend, discarding all but the last result. If you do that, I can
remove the stuff from psql.c.
Question though, I didn't think my request would have caused a major protocol
change. I thought that the '-1' would simply be replaced by the correct size?
Well, the -1 is in attlen, which is the type length. text, char, and
varchar are all varlena (variable length), hence -1. atttypmod is the length
specified at attribute creation time. It is similar, but not the same
as the length, and trying to put the typmod in the length field really
messes up the clarity of what is going on. We added atttypmod to
clarify the code in the backend, and it should be sent to the front end.
Soon, maybe we will have atttypmod specifying the precision of DECIMAL, or
the currency of MONEY.
As far as adding atttypmod to libpq, I say do it. If you look in the
backend's BeginCommand(), under the Remote case label, you will see it
sending the atttypid to the front end, using the TupleDesc that was
passed to it. Just after sending the atttyplen, I can send the
atttypmod value, which is an int16. I can do all the backend changes.
There are a few places where this would have to be changed in the
backend.
Other front-end libraries reading this protocol will have to change
to accept this field.
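A client-side sketch of reading one field descriptor under the proposed extension: after the type OID (int32) and type length (int16), the backend would append the int16 atttypmod, all in network byte order. This layout is an assumption based on the discussion, not a final protocol definition, and the function names are mine.

```c
#include <stdint.h>
#include <stddef.h>

/* Read big-endian integers from the wire. */
static uint16_t get_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse typid/typlen/typmod from one field descriptor (the field name
 * is assumed already consumed).  Returns the number of bytes read. */
size_t parse_field_desc(const unsigned char *buf,
                        uint32_t *typid, int16_t *typlen, int16_t *typmod)
{
    *typid  = get_be32(buf);                  /* pg_type OID        */
    *typlen = (int16_t)get_be16(buf + 4);     /* attlen, -1=varlena */
    *typmod = (int16_t)get_be16(buf + 6);     /* proposed atttypmod */
    return 8;
}
```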
Byron Nikolaidis <byronn@insightdist.com> writes:
Yes, it rings a bell alright. When you execute a multiple query
(denoted by semicolons) like "set geqo to 'off'; show datestyle;
select * from table", you get those multiple returns and MUST read
until you get the 'I'. If you don't, you're screwed the next time you
try and read anything because all that stuff is still in the pipe.
That seems pretty bogus. What happens if you do
select * from table1; select * from table2
? The way the code in libpq looks, I think the response from the
first select would get lost entirely (probably even cause a memory
leak). It's not set up to handle receipt of more than one command
response in any clean fashion. We'd need to revise the application
API to make that work right.
Playing around with psql, it seems that you can't actually get psql
to submit a multi-command line as a single query; it seems to break
it up into separate queries. Which is what libpq can cope with.
Yep, you figured it out. (See earlier posting.)
I have now thought about the problem some more, and I think an even
better solution would be that if the backend receives multiple commands
in a single query, it just returns the first or last result. There is
no mechanism in libpq to send a query and get multiple results back, so
why not just return one result.
No need to count the number of queries sent, and no reason to send empty
queries to the backend looking for the last result.
If you want me to do this for the backend, let me know and I will do it.
First or last result? What do we return now?
I think we should either forbid multiple commands per PQexec call,
or fix libpq to handle them properly (and hence be able to return
a series of PGresults, not just one).
Question though, I didn't think my request would have caused a major
protocol change. I thought that the '-1' would simply be replaced by
the correct size?
I assumed we'd want to add the restypmod as a new field in PGresult
and in the protocol. But I'm just a newbie.
restypmod may not be available at the time of returning the result, but
the TupleDesc is, and it has the proper atttypmod.
On Sat, 25 Apr 1998, Bruce Momjian wrote:
Yes, it rings a bell alright,
When you execute a multiple query (denoted by semicolons) like "set geqo to
'off'; show datestyle; select * from table", you get those multiple returns and
MUST read until you get the 'I'. If you don't, you're screwed the next time you
try and read anything because all that stuff is still in the pipe.
Good point. If we don't send the empty query, the queued up results get
out of sync with the requests.
One solution is to handle it the way psql does. It keeps track of the
quotes, backslashes, and semicolons in the input string, and sends just
one query each time to the backend, and prints the results.
Now, with libpq, I think the proper solution would be to scan the input
string, count the number of queries being sent, send the whole
string (with the multiple queries), and retrieve that many answers from
the backend, discarding all but the last result. If you do that, I can
remove the stuff from psql.c.
I think for libpq, that would be a good idea, but it would mean that there
is a difference in behaviour between the interfaces.
The JDBC spec allows for multiple ResultSet's to be returned from a query,
and our driver handles this already.
Now is this the client libpq, or the backend libpq you are thinking of
changing? If it's the backend one, then this will break JDBC with multiple
result sets.
Question though, I didn't think my request would have caused a major protocol
change. I thought that the '-1' would simply be replaced by the correct size?
Well, the -1 is in attlen, which is the type length. text, char, and
varchar are all varlena (variable length), hence -1. atttypmod is the length
specified at attribute creation time. It is similar, but not the same
as the length, and trying to put the typmod in the length field really
messes up the clarity of what is going on. We added atttypmod to
clarify the code in the backend, and it should be sent to the front end.
Soon, maybe we will have atttypmod specifying the precision of DECIMAL, or
the currency of MONEY.
That would be useful.
As far as adding atttypmod to libpq, I say do it. If you look in the
backend's BeginCommand(), under the Remote case label, you will see it
sending the atttypid to the front end, using the TupleDesc that was
passed to it. Just after sending the atttyplen, I can send the
atttypmod value, which is an int16. I can do all the backend changes.
There are a few places where this would have to be changed in the
backend.
Other front-end libraries reading this protocol will have to change
to accept this field.
As soon as you do it, I'll convert JDBC.
--
Peter T Mount peter@retep.org.uk or petermount@earthling.net
Main Homepage: http://www.demon.co.uk/finder (moving soon to www.retep.org.uk)
************ Someday I may rebuild this signature completely ;-) ************
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk