Is there any plan to add unsigned integer types?

Started by crocketover 14 years ago6 messages

crockabiscuit@gmail.com

over 14 years ago

MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.

Peter Eisentraut

peter_e@gmx.net

over 14 years ago

In reply to: crocket (#1)

Re: Is there any plan to add unsigned integer types?

On mån, 2011-09-26 at 19:41 +0900, crocket wrote:

MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.

I believe there have been many discussions about this in the past,
outlining the various issues that would come with this project. A first
step would be to start implementing this in user space and see how much
breaks.

mmoncure@gmail.com

over 14 years ago

In reply to: crocket (#1)

Re: Is there any plan to add unsigned integer types?

On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:

MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.

This comes up now and then. The problem is the benefit gained is not
really worth the pain. In today's 64 bit world, choosing a 64 bit int
nails the cases where you need the extra range and you have the
ability to use constraints (if necessary, through a domain) to enforce
correctness.

On the downside, you have to add a lot of casts, overloads, etc.
Figuring out the casting rules is non trivial and could lead to
surprising behaviors...inferring the type of 'unknown' strings is bad
enough as it is.

TBH, what I'd greatly prefer to see is to have domains be finished up
so that you don't have to carefully consider their use (for example,
you can't make arrays out of them). Then an unsigned int could simply
be:

create domain uint as bigint check (value >= 0);

merlin

robertmhaas@gmail.com

over 14 years ago

In reply to: Merlin Moncure (#3)

Re: Is there any plan to add unsigned integer types?

On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:

MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.

This comes up now and then. The problem is the benefit gained is not
really worth the pain. In today's 64 bit world, choosing a 64 bit int
nails the cases where you need the extra range and you have the
ability to use constraints (if necessary, through a domain) to enforce
correctness.

On the downside, you have to add a lot of casts, overloads, etc.
Figuring out the casting rules is non trivial and could lead to
surprising behaviors...inferring the type of 'unknown' strings is bad
enough as it is.

TBH, what I'd greatly prefer to see is to have domains be finished up
so that you don't have to carefully consider their use (for example,
you can't make arrays out of them). Then an unsigned int could simply
be:

create domain uint as bigint check (value >= 0);

Even if we did that, there might still be cases where people would
want unsigned integers as a means of reducing storage. 4 extra bytes
may not seem like that much, but if you have billions of rows, it adds
up - not just in terms of actual storage space, but also in terms of
disk and memory bandwidth requirements when you want to do anything
with that data. I have seen some recent data (which is not entirely
conclusive) that suggests that memory bandwidth can be a huge problem
for PostgreSQL performance on large boxes; and I think Greg Smith has
made similar comments in the past (correct me if I'm wrong, Greg).

I think, though, that if we choose to attack that problem in the first
instance by adding support for unsigned integers, we're probably going
to be only nibbling around the edges of the problem. Reducing
alignment padding and adding block-level compression would benefit a
much larger number of workloads. Those are not easy projects, but
unfortunately, due to the constraints of our type system, neither is
this. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

mmoncure@gmail.com

over 14 years ago

In reply to: Robert Haas (#4)

Re: Is there any plan to add unsigned integer types?

On Mon, Sep 26, 2011 at 9:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Mon, Sep 26, 2011 at 5:41 AM, crocket <crockabiscuit@gmail.com> wrote:

MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute "create domain
UINT" in every database.

If "INT unsigned" and "SERIAL unsigned" exist, PostgreSQL would bring
more convenience to users.

This comes up now and then. The problem is the benefit gained is not
really worth the pain. In today's 64 bit world, choosing a 64 bit int
nails the cases where you need the extra range and you have the
ability to use constraints (if necessary, through a domain) to enforce
correctness.

On the downside, you have to add a lot of casts, overloads, etc.
Figuring out the casting rules is non trivial and could lead to
surprising behaviors...inferring the type of 'unknown' strings is bad
enough as it is.

TBH, what I'd greatly prefer to see is to have domains be finished up
so that you don't have to carefully consider their use (for example,
you can't make arrays out of them). Then an unsigned int could simply
be:

create domain uint as bigint check (value >= 0);

Even if we did that, there might still be cases where people would
want unsigned integers as a means of reducing storage. 4 extra bytes
may not seem like that much, but if you have billions of rows, it adds
up - not just in terms of actual storage space, but also in terms of
disk and memory bandwidth requirements when you want to do anything
with that data. I have seen some recent data (which is not entirely
conclusive) that suggests that memory bandwidth can be a huge problem
for PostgreSQL performance on large boxes; and I think Greg Smith has
made similar comments in the past (correct me if I'm wrong, Greg).

I think, though, that if we choose to attack that problem in the first
instance by adding support for unsigned integers, we're probably going
to be only nibbling around the edges of the problem. Reducing
alignment padding and adding block-level compression would benefit a
much larger number of workloads. Those are not easy projects, but
unfortunately, due to the constraints of our type system, neither is
this. :-(

right -- exactly. most 'savings' in this vein are nothing but due to
padding and other factors such that (at least today) there is no
disadvantage to going to 64 bit in range constrained cases. also, I'd
submit history has been unkind to hardware dependent optimization
strategies in userland -- the engine should be dealing with this
problem. better to define your data the proper way and let the
assembly instruction counting gurus in -hackers worry about it :-).

compression is an interesting topic: the guys over at tokudb are
making some wild claims...i'm curious if they are real, and what the
real tradeoffs are.

merlin

Leonardo Francalanci

m_lists@yahoo.it

over 14 years ago

In reply to: Merlin Moncure (#5)

Re: Is there any plan to add unsigned integer types?

compression is an interesting topic: the guys over at tokudb are
making some wild claims...i'm curious if they are real, and what the
real tradeoffs are.

I don't know how much of the performance they claim comes from
compression and how much from the different indexing technique they
use (see the my post here, where nobody answered...

http://archives.postgresql.org/pgsql-general/2011-09/msg00615.php

)