Re: [DOCS] Re: FE/BE protocol revision patch

Started by Thomas G. Lockhart over 27 years ago, 13 messages
#1 Thomas G. Lockhart
lockhart@alumni.caltech.edu

1. Implement addition of atttypmod field to RowDescriptor messages.
The client-side code is there but ifdef'd out. I have no idea
what to change on the backend side. The field should be sent
only if protocol >= 2.0, of course.
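
The version gate described in this item can be sketched as follows. This is a minimal illustration, not the backend's actual code: the `FieldDesc` struct, the helper names, the omission of the column name, the big-endian encoding, and the 2-byte typmod width (the width under discussion at the time) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-column descriptor; field names are illustrative only. */
typedef struct {
    uint32_t type_oid;  /* type identifier of the column */
    int16_t  type_len;  /* storage size, negative for variable length */
    int16_t  typmod;    /* type modifier, -1 when unused */
} FieldDesc;

/* Append a 16-bit value in big-endian order; returns bytes written. */
static size_t put_u16(uint8_t *out, uint16_t v) {
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xFF);
    return 2;
}

/* Append a 32-bit value in big-endian order; returns bytes written. */
static size_t put_u32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/*
 * Serialize one field descriptor into out (needs at least 8 bytes).
 * The typmod is written only when the negotiated major version is >= 2,
 * so older clients keep seeing the layout they expect.
 */
static size_t write_field_desc(uint8_t *out, const FieldDesc *f, int proto_major) {
    size_t n = 0;
    assert(out != NULL && f != NULL);
    n += put_u32(out + n, f->type_oid);
    n += put_u16(out + n, (uint16_t)f->type_len);
    if (proto_major >= 2)
        n += put_u16(out + n, (uint16_t)f->typmod);
    return n;
}
```

A client reading the message would need the mirror-image check, consuming the extra two bytes only when it negotiated version 2 or later.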

Hmm. I was hoping to do something in the backend to allow data types
like numeric(p,s) which take multiple qualifying arguments (in this
case, precision and scale). One possibility was to shoehorn both fields
into the existing atttypmod 16-bit field.

Seems like atttypmod is now being used for things outside of the
backend, but I'm not sure how to support these other uses with these
other possible data types.

A better general approach to the type qualifier problem might be to
define a variable-length data type which specifies column
characteristics, and then pass that around. For character strings, it
would have one field, and for numeric() and decimal() it would have two.
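
One way to picture the variable-length qualifier proposed here is a count followed by that many values. The sketch below is only an illustration of the idea; the `TypeQual` name, its layout, and the `typequal_make` helper are hypothetical and do not come from the PostgreSQL sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical variable-length qualifier: a count, then that many values. */
typedef struct {
    int16_t nfields;   /* 0 for fixed types, 1 for char(n), 2 for numeric(p,s) */
    int32_t fields[];  /* C99 flexible array member */
} TypeQual;

/* Allocate a qualifier holding a copy of the n values in vals; caller frees. */
static TypeQual *typequal_make(int16_t n, const int32_t *vals) {
    assert(n >= 0);
    TypeQual *q = malloc(sizeof(TypeQual) + (size_t)n * sizeof(int32_t));
    if (q == NULL)
        return NULL;
    q->nfields = n;
    for (int16_t i = 0; i < n; i++)
        q->fields[i] = vals[i];
    return q;
}
```

Under this layout a char(30) column would carry one value and a numeric(10,2) column two, while a type with fixed properties would carry only the count.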

Comments? Ideas??

- Tom

#2 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Thomas G. Lockhart (#1)

1. Implement addition of atttypmod field to RowDescriptor messages.
The client-side code is there but ifdef'd out. I have no idea
what to change on the backend side. The field should be sent
only if protocol >= 2.0, of course.

Hmm. I was hoping to do something in the backend to allow data types
like numeric(p,s) which take multiple qualifying arguments (in this
case, precision and scale). One possibility was to shoehorn both fields
into the existing atttypmod 16-bit field.

Seems like atttypmod is now being used for things outside of the
backend, but I'm not sure how to support these other uses with these
other possible data types.

We are just passing it back. There is no special handling of atttypmod
that you need to worry about. I think we pass it back so Openlink can
know the actual length of the char() and varchar() fields without doing
a dummy select. However, they better know it is a char()/varchar()
field before using it for such a purpose, because it could be used from
something else later on, as you suggest.
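
The caution above, that a client should confirm the column type before reading a length out of atttypmod, might look like this on the client side. It is a sketch under assumptions: the function name is made up, and it assumes the modifier holds the declared length directly, whereas an implementation may add a header offset. The two OIDs are the pg_type entries for char(n) and varchar(n).

```c
#include <assert.h>
#include <stdint.h>

/* Type OIDs for char(n) and varchar(n) in the pg_type catalog. */
#define BPCHAROID  1042
#define VARCHAROID 1043

/*
 * Return the declared length of a character column, or -1 if the column
 * is not char()/varchar() or carries no modifier.
 * Assumption: the modifier holds the length directly.
 */
static int32_t declared_char_length(uint32_t type_oid, int32_t typmod) {
    assert(typmod >= -1);  /* nothing below -1 is defined in this sketch */
    if (type_oid != BPCHAROID && type_oid != VARCHAROID)
        return -1;         /* the modifier may mean something else here */
    if (typmod < 0)
        return -1;         /* -1 means "no modifier" */
    return typmod;
}
```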

A better general approach to the type qualifier problem might be to
define a variable-length data type which specifies column
characteristics, and then pass that around. For character strings, it
would have one field, and for numeric() and decimal() it would have two.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)

"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:

A better general approach to the type qualifier problem might be to
define a variable-length data type which specifies column
characteristics, and then pass that around. For character strings, it
would have one field, and for numeric() and decimal() it would have two.

... and for ordinary column datatypes of fixed properties, it needn't
have *any* fields. That would more than pay for the space cost of
supporting a variable-width data type, I bet. I like this.

Once atttypmod is exposed to applications it will be much harder to
change its representation or meaning, so I'd suggest getting this right
before 6.4 comes out. If that doesn't seem feasible, I think I'd even
vote for backing out the change that makes atttypmod visible until it
can be done right.

regards, tom lane

#4 Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#3)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Once atttypmod is exposed to applications it will be much harder to
change its representation or meaning

Yeah, that is what I'm worried about too...

- Tom (the other "tgl")

#5 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Thomas G. Lockhart (#4)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Once atttypmod is exposed to applications it will be much harder to
change its representation or meaning

Yeah, that is what I'm worried about too...

- Tom (the other "tgl")

Well, atttypmod is stored in various C structures, like Resdom, so we
would need some C representation for the type, even if it is just a void *.

Very few system columns are varlena/text, and for good reason, perhaps.

Such a change is certainly going to make things in the backend slightly
harder, so please let me know what advantage a varlena atttypmod is going to
have.

Zero overhead for types that don't use it is meaningless, because the
varlena length is 4 bytes, while current atttypmod is only two. Second,
I don't see how a varlena makes atttypmod less type-specific. We
currently return a -1 when it is not being used.

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Bruce Momjian <maillist@candle.pha.pa.us> writes:

Zero overhead for types that don't use it is meaningless, because the
varlena length is 4 bytes, while current atttypmod is only two. Second,
I don't see how a varlena makes atttypmod less type-specific.

Well, the issue is making sure that it will be adequate for future
datatypes that we can't foresee.

I can see that a variable-size atttypmod might be a tad painful to
support. If you don't want to go that far, a reasonable compromise
would be to make it int4 instead of int2. int2 is already uncomfortably
tight for the numeric/decimal datatypes, which we surely will want to
support soon (at least I do ;-)). int4 should give a little breathing
room for datatypes that need to encode more than one subfield into
atttypmod.

regards, tom lane

#7 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Tom Lane (#6)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Bruce Momjian <maillist@candle.pha.pa.us> writes:

Zero overhead for types that don't use it is meaningless, because the
varlena length is 4 bytes, while current atttypmod is only two. Second,
I don't see how a varlena makes atttypmod less type-specific.

Well, the issue is making sure that it will be adequate for future
datatypes that we can't foresee.

I can see that a variable-size atttypmod might be a tad painful to
support. If you don't want to go that far, a reasonable compromise
would be to make it int4 instead of int2. int2 is already uncomfortably
tight for the numeric/decimal datatypes, which we surely will want to
support soon (at least I do ;-)). int4 should give a little breathing
room for datatypes that need to encode more than one subfield into
atttypmod.

Comments? I am willing to change it.

#8 Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: Bruce Momjian (#7)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

.. and for ordinary column datatypes of fixed properties, it needn't
have *any* fields. That would more than pay for the space cost of
supporting a variable-width data type, I bet. I like this.

Actually not. Since atttypmod is stored with the table definition, it does not waste any space
on a per-tuple basis. So I think the correct solution would rather be to extend the atttypmod idea
(maybe make atttypmod an array). Maybe we should add an atttypformat field of type varchar()
(this could be used for language and the like).

It would be rather bad to convert fixed length fields into varlena, since varlena costs a lot
during tuple access. The cheapest rows are those that have an overall fixed length.
So I think it is best to store as much info with the table definition as possible.
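
The access-cost argument can be illustrated with a small sketch. It is not the backend's tuple code: the row layout (each variable-length value stored as a 4-byte length word that counts itself, followed by the data, in host byte order) and both function names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of column col when every column has a known fixed width.
   It depends only on the table definition, so it can be computed once
   and reused for every row. */
static size_t fixed_offset(const size_t *widths, size_t col) {
    size_t off = 0;
    for (size_t i = 0; i < col; i++)
        off += widths[i];
    return off;
}

/* Offset of column col when every column is a length-prefixed value.
   Each earlier value in this particular row has to be inspected, so the
   work is repeated per row. */
static size_t varlen_offset(const uint8_t *row, size_t col) {
    size_t off = 0;
    for (size_t i = 0; i < col; i++) {
        uint32_t len;
        memcpy(&len, row + off, sizeof len);  /* host byte order */
        assert(len >= sizeof len);            /* length word counts itself */
        off += len;
    }
    return off;
}
```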

Once atttypmod is exposed to applications it will be much harder to
change its representation or meaning, so I'd suggest getting this right
before 6.4 comes out. If that doesn't seem feasible, I think I'd even
vote for backing out the change that makes atttypmod visible until it
can be done right.

atttypmod is the right direction; it currently lacks only extensibility.

Andreas

#9 Noname
dg@illustra.com
In reply to: Bruce Momjian (#7)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Again, old news, but am wading through my backlog.

Bruce Momjian and Tom Lane are discussing atttypmod, its uses, and prospects:

Bruce Momjian <maillist@candle.pha.pa.us> writes:

Zero overhead for types that don't use it is meaningless, because the
varlena length is 4 bytes, while current atttypmod is only two. Second,
I don't see how a varlena makes atttypmod less type-specific.

Well, the issue is making sure that it will be adequate for future
datatypes that we can't foresee.

I can see that a variable-size atttypmod might be a tad painful to
support. If you don't want to go that far, a reasonable compromise
would be to make it int4 instead of int2. int2 is already uncomfortably
tight for the numeric/decimal datatypes, which we surely will want to
support soon (at least I do ;-)). int4 should give a little breathing
room for datatypes that need to encode more than one subfield into
atttypmod.

Comments? I am willing to change it.

An int4 atttypmod should be fine. A bit of overhead perhaps, but who
quibbles about a few bytes these days? And, perhaps there is a use.

Andreas Zeugswetter <andreas.zeugswetter@telecom.at> adds to the discussion:

Once atttypmod is exposed to applications it will be much harder to
change its representation or meaning, so I'd suggest getting this right
before 6.4 comes out. If that doesn't seem feasible, I think I'd even
vote for backing out the change that makes atttypmod visible until it
can be done right.

atttypmod is the right direction; it currently lacks only extensibility.

Andreas

But, I think a line needs to be drawn. There is no way to foresee all the
possible uses to cover all future extensibility within the protocol. But,
the protocol should not be responsible for this anyway, that is really
the role of type implementation.

Right now the protocol supports some types (char, int, float etc) in a
special way. And it provides for composites. But it doesn't (and no-one
is arguing that it should) support images or sounds or timeseries in a
special way. The type itself has to handle that chore. All the protocol
really should do is provide a way to find the size and type of a value.
Which it does.

Numeric is a kind of borderline case. I think a perfectly good numeric
implementation could be made using varlenas to hold binary representations
of infinite precision scaled integers with precision and scale embedded in
the data. But, Numeric is an SQL92 type, and it is very common in SQL
applications and so the extra convenience of built-in support in the
protocol is probably justified. And, Numeric support is something we know
about the need for now.

But, I don't think that spending a lot of effort or complicating the backend
code to support currently unknown and undefined possible future extensibility
is worthwhile.

My opinion only, but every project I have seen that started to get serious
about predicting future requirements ended up failing to meet known current
requirements.

-dg

David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software 300 Lakeside Drive Oakland, CA 94612
- A child of five could understand this! Fetch me a child of five.

#10 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Noname (#9)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Comments? I am willing to change it.

An int4 atttypmod should be fine. A bit of overhead perhaps, but who
quibbles about a few bytes these days? And, perhaps there is a use.

Yea, no one commented, so it stays an int2 until someone finds a type
that needs more than a two-byte atttypmod. Right now, it fits the need.

#11 Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Bruce Momjian (#10)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Comments? I am willing to change it.

An int4 atttypmod should be fine. A bit of overhead perhaps, but
who quibbles about a few bytes these days? And, perhaps there is a
use.

Yea, no one commented, so it stays an int2 until someone finds a type
that needs more than a two-byte atttypmod. Right now, it fits the
need.

Well, I didn't comment because I haven't yet worked out the issues. But
I'll go with Bruce's and David's inclination that we should shoehorn
numeric()/decimal() into something like the existing atttypmod field
rather than trying for "the general solution" which btw isn't obvious
how to do.

However, I don't think that 16 bits vs 32 bits is an issue at all
performance-wise, and I'd like to see atttypmod go to 32 bits just to give a
little breathing room. I'm already using int32 to send atttypmod to the
new char/varchar sizing functions.

Can we go to int32 on atttypmod? I'll try to break it up into two
sub-fields to implement numeric().
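
Splitting a 32-bit atttypmod into two sub-fields could be sketched like this. The layout (precision in the high 16 bits, scale in the low 16 bits), the function names, and the `TYPMOD_NONE` macro are assumptions for illustration, not the encoding that was eventually adopted.

```c
#include <assert.h>
#include <stdint.h>

#define TYPMOD_NONE (-1)  /* sentinel: column has no modifier */

/* Pack precision into the high 16 bits and scale into the low 16 bits. */
static int32_t numeric_typmod_pack(int precision, int scale) {
    /* Keeping precision below 0x8000 keeps the result non-negative, so it
       can never collide with TYPMOD_NONE. */
    assert(precision >= 1 && precision <= 0x7FFF);
    assert(scale >= 0 && scale <= precision);
    return (int32_t)(((uint32_t)precision << 16) | (uint32_t)scale);
}

/* Recover the precision from a packed modifier. */
static int numeric_typmod_precision(int32_t typmod) {
    assert(typmod != TYPMOD_NONE);
    return (int)((uint32_t)typmod >> 16);
}

/* Recover the scale from a packed modifier. */
static int numeric_typmod_scale(int32_t typmod) {
    assert(typmod != TYPMOD_NONE);
    return (int)((uint32_t)typmod & 0xFFFF);
}
```

With this layout numeric(10,2) packs to 655362, and both parts come back out with a shift and a mask.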

btw, anyone know of a package for variable- and large-precision
numerics? I have looked at the GNU gmp package, but it looks to me that
it probably won't fit into the db backend without lots of overhead. Will
probably try to use the int64 package in contrib for now...

- Tom

#12 Noname
dg@illustra.com
In reply to: Thomas G. Lockhart (#11)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Comments? I am willing to change it.

An int4 atttypmod should be fine. A bit of overhead perhaps, but
who quibbles about a few bytes these days? And, perhaps there is a
use.

Yea, no one commented, so it stays an int2 until someone finds a type
that needs more than a two-byte atttypmod. Right now, it fits the
need.

Well, I didn't comment because I haven't yet worked out the issues. But
I'll go with Bruce's and David's inclination that we should shoehorn
numeric()/decimal() into something like the existing atttypmod field
rather than trying for "the general solution" which btw isn't obvious
how to do.

However, I don't think that 16 bits vs 32 bits is an issue at all
performance-wise, and I'd like to see atttypmod go to 32 bits just to give a
little breathing room. I'm already using int32 to send atttypmod to the
new char/varchar sizing functions.

Can we go to int32 on atttypmod? I'll try to break it up into two
sub-fields to implement numeric().

btw, anyone know of a package for variable- and large-precision
numerics? I have looked at the GNU gmp package, but it looks to me that
it probably won't fit into the db backend without lots of overhead. Will
probably try to use the int64 package in contrib for now...

- Tom

Int32 is fine with me. Or maybe uint32? Or maybe

union {
    uint32 u;
    struct {
        int16 h;
        int16 l;
    } s;
};

Oh no, it is happening again....

Let's just go with uint32.

-dg

David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Of course, someone who knows more about this will correct me if I'm wrong,
and someone who knows less will correct me if I'm right."
--David Palmer (palmer@tybalt.caltech.edu)

#13 Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Noname (#12)
Re: [HACKERS] Re: [DOCS] Re: FE/BE protocol revision patch

Well, I didn't comment because I haven't yet worked out the issues. But
I'll go with Bruce's and David's inclination that we should shoehorn
numeric()/decimal() into something like the existing atttypmod field
rather than trying for "the general solution" which btw isn't obvious
how to do.

However, I don't think that 16 bits vs 32 bits is an issue at all
performance-wise, and I'd like to see atttypmod go to 32 bits just to give a
little breathing room. I'm already using int32 to send atttypmod to the
new char/varchar sizing functions.

Can we go to int32 on atttypmod? I'll try to break it up into two
sub-fields to implement numeric().

btw, anyone know of a package for variable- and large-precision
numerics? I have looked at the GNU gmp package, but it looks to me that
it probably won't fit into the db backend without lots of overhead. Will
probably try to use the int64 package in contrib for now...

- Tom

Int32 is fine with me. Or maybe uint32? Or maybe

union {
    uint32 u;
    struct {
        int16 h;
        int16 l;
    } s;
};

Oh no, it is happening again....

Let's just go with uint32.

Can't be unsigned. -1 must be a valid value.
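
The objection can be shown in a few lines. This is only an illustration; the function names and the `TYPMOD_UNUSED` macro are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TYPMOD_UNUSED (-1)  /* "no modifier" marker, as used in the thread */

/* With a signed field, the "unused" test is a plain comparison. */
static int typmod_is_unused(int32_t typmod) {
    assert(typmod >= TYPMOD_UNUSED);  /* nothing below -1 is defined here */
    return typmod == TYPMOD_UNUSED;
}

/* Storing the same marker in an unsigned field wraps it to 4294967295,
   which can no longer be told apart from a genuine, very large modifier. */
static uint32_t typmod_as_unsigned(int32_t typmod) {
    return (uint32_t)typmod;
}
```

Code that tests for a negative modifier would silently stop matching once the field became unsigned, which is why the marker needs a signed type.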
