bigint out of range
Hello team,
We are getting "ERROR: bigint out of range". Please help with this.
ERROR: bigint out of range
kbdb=# INSERT INTO kb_dar_ran_url_check (url_hash,stat_date,topic_id,site_id,url,status,user_comments,review_comments) VALUES (72894677781133866997924561390146294513,E'19-04-2019',32793,1035,E'https://bikez.com/search/index.php',1,NULL,NULL);
ERROR: bigint out of range
The table structure is:
Table "kb_test.kb_dar_ran_url_check"
Column | Type | Collation | Nullable | Default
-----------------+-------------------------+-----------+----------+---------
status | bigint | | |
url_hash | bigint | | not null |
url | character varying(4000) | | not null |
review_comments | character varying(4000) | | |
user_comments | character varying(4000) | | |
stat_date | character varying(128) | | not null |
topic_id | numeric(38,0) | | not null |
site_id | numeric(38,0) | | not null |
Partition key: LIST (stat_date)
Thanks,
Daulat
On Thu, 16 May 2019 at 16:31, Daulat Ram <Daulat.Ram@exponential.com> wrote:
Hello team,
We are getting "ERROR: bigint out of range". Please help with this.
bigint ranges from -9223372036854775808 to 9223372036854775807.
https://www.postgresql.org/docs/current/datatype-numeric.html
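For a concrete check: the url_hash value in the failing INSERT has 38 digits, far outside that range. A quick sketch (Python used here purely for illustration):

```python
# The url_hash value from the failing INSERT (38 digits).
url_hash = 72894677781133866997924561390146294513

# Range of PostgreSQL's bigint: a signed 64-bit integer.
BIGINT_MAX = 2**63 - 1   # 9223372036854775807
BIGINT_MIN = -2**63      # -9223372036854775808

print(len(str(url_hash)))     # 38 digits
print(url_hash > BIGINT_MAX)  # True: cannot fit in a bigint
```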
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
As a bonus:
Use text instead of arbitrary varchar(4000) fields and add, e.g., check
(length(url) < 4000) or something better.
Needing 38 digit integers for "id" fields seems odd.
Not sure what kind of date would need 128 characters to represent; ideally
you'd just make that a "date" typed field and normalize on input.
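As an illustration of that normalization step, a small sketch (Python for illustration; the day-month-year reading of '19-04-2019' is an assumption based on the failing INSERT — adjust the format string if the data is month-first):

```python
from datetime import datetime, date

def normalize_stat_date(s: str) -> date:
    """Parse a 'DD-MM-YYYY' string into a proper date value,
    suitable for storing in a date-typed column instead of
    varchar(128)."""
    return datetime.strptime(s, "%d-%m-%Y").date()

print(normalize_stat_date("19-04-2019"))  # 2019-04-19
```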
David J.
Adding to what David said, I don't see much point in having a bigint
status. Usually status attributes are a fixed set of values that can be
checked programmatically.
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com> wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
Or numeric(38, 0). I think it isn't coincidence that he tries to store
a 38-digit number in it.
As a bonus:
Use text instead of arbitrary varchar(4000) fields and add, e.g., check
(length(url) < 4000) or something better.
Needing 38 digit integers for "id" fields seems odd.
I guess the schema was copied from Oracle. In Oracle, all numbers are
really 38 digit decimal floating point numbers and the limit for
varchar2 is 4000.
hp
--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
Or numeric(38, 0). I think it isn't coincidence that he tries to store
a 38-digit number in it.
You don’t perform math on a hash, thus it's not a number, no matter that it
may contain only digits.
David J.
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
Or numeric(38, 0). I think it isn't coincidence that he tries to store
a 38-digit number in it.
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
thus it's not a number
This is just silly. All hash functions I have ever encountered compute a
single fixed size integer from a stream of integers. The result may be
larger than a machine word, in which case the representation in C (or a
similar low level language) may be an array of words (or bytes), but
it's still an integer.
hp
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
On Saturday, May 18, 2019, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
Or numeric(38, 0). I think it isn't coincidence that he tries to store
a 38-digit number in it.
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
--
Angular momentum makes the world go 'round.
On 2019-05-18 15:19:22 -0500, Ron wrote:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
hash modulo table size.
hp
On 5/18/19 3:49 PM, Peter J. Holzer wrote:
On 2019-05-18 15:19:22 -0500, Ron wrote:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
hash modulo table size.
I've seen that used when the *tablespace* is pre-allocated, and you hash
modulo the tablespace *page number*. (Yes, performance tanks when you start
filling up pages.) How do you hash on the (ever growing) table size?
On 2019-05-18 17:14:59 -0500, Ron wrote:
On 5/18/19 3:49 PM, Peter J. Holzer wrote:
On 2019-05-18 15:19:22 -0500, Ron wrote:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
hash modulo table size.
I've seen that used when the tablespace is pre-allocated, and you hash modulo
the tablespace page number. (Yes, performance tanks when you start filling up
pages.) How do you hash on the (ever growing) table size?
The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today.
To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:
i = h % nr_buckets
If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.
(If nr_buckets is a power of two, the modulo operation can be
implemented efficiently with a bitwise AND.)
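A minimal sketch of that bucket computation (Python for illustration; the sample hash value and bucket counts are arbitrary):

```python
def bucket_index(h: int, nr_buckets: int) -> int:
    # Reduce a large hash value to a bucket index in [0, nr_buckets).
    return h % nr_buckets

h = 0xDEADBEEFCAFEF00D  # stand-in for a 64-bit hash value

print(bucket_index(h, 1000))

# Power-of-two special case: h % 2**k equals h & (2**k - 1),
# so the modulo reduces to a single bitwise AND.
nr_buckets = 1 << 16
assert bucket_index(h, nr_buckets) == h & (nr_buckets - 1)
```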
hp
On 5/18/19 5:39 PM, Peter J. Holzer wrote:
On 2019-05-18 17:14:59 -0500, Ron wrote:
On 5/18/19 3:49 PM, Peter J. Holzer wrote:
On 2019-05-18 15:19:22 -0500, Ron wrote:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
hash modulo table size.
I've seen that used when the tablespace is pre-allocated, and you hash modulo
the tablespace page number. (Yes, performance tanks when you start filling up
pages.) How do you hash on the (ever growing) table size?
The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today.
To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:
i = h % nr_buckets
If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.
Ouch. Response time on a big table would take a serious hit if that rehash
happened in the middle of the day on a big OLTP system. Even worse if it
were a 24x365 system, because you couldn't schedule an enlargement/rehash
during a down period.
On 2019-05-18 19:16:19 -0500, Ron wrote:
On 5/18/19 5:39 PM, Peter J. Holzer wrote:
On 2019-05-18 17:14:59 -0500, Ron wrote:
On 5/18/19 3:49 PM, Peter J. Holzer wrote:
On 2019-05-18 15:19:22 -0500, Ron wrote:
On 5/18/19 2:27 PM, Peter J. Holzer wrote:
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
You don’t perform math on a hash
That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.
How is it "using math" to use a hash key in a hash lookup table?
hash modulo table size.
I've seen that used when the tablespace is pre-allocated, and you hash modulo
the tablespace page number. (Yes, performance tanks when you start filling up
pages.) How do you hash on the (ever growing) table size?
The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today.
To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:
i = h % nr_buckets
If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.
Ouch. Response time on a big table would take a serious hit if that rehash
happened in the middle of the day on a big OLTP system.
So that might be a reason not to use hash indexes on tables where the worst
case insert time is a concern (the average insert time is still O(1)).
But please be aware that I answered your question 'How is it "using
math" to use a hash key?', not 'How are hash indexes in PostgreSQL
implemented?'. So my answer covered the most simple and generic
implementation.
You can split buckets lazily, and in fact PostgreSQL does this. I just
looked at the code briefly, but I don't really understand how it works.
Guess I would have to read Margo Seltzer's paper for that.
Still, even with that optimization (or tradeoff - it may make the lookup
slower) the README notes that splits are expensive.
hp
On 5/19/19 5:43 AM, Peter J. Holzer wrote:
[snip]
But please be aware that I answered your question 'How is it "using
math" to use a hash key?', not 'How are hash indexes in PostgreSQL
implemented?'. So my answer covered the most simple and generic
implementation.
I understand.
Greetings,
* David G. Johnston (david.g.johnston@gmail.com) wrote:
On Thu, May 16, 2019 at 8:31 AM Daulat Ram <Daulat.Ram@exponential.com>
wrote:
url_hash | bigint | | not null |
Change the type of url_hash; make it text instead of bigint.
Making it text wastes a bunch of space actually, since it's really a
binary value. I tend to recommend using a bytea for storing hashes.
Given that hashes are really fixed width, it'd be nice if we had a set
of proper datatypes for them, perhaps not so much to avoid the 1-byte
overhead from storing as a variable-length bytea, but because we could
then avoid having a 7-byte hole due to padding if the hash is followed
by a bigint or such.
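To put numbers on the space difference, a sketch (Python for illustration; SHA-256 is chosen only as an example of a fixed-width hash):

```python
import hashlib

url = b"https://bikez.com/search/index.php"  # URL from the failing INSERT

digest = hashlib.sha256(url).digest()  # raw binary, as a bytea would store it
hex_text = digest.hex()                # hex string, as a text column would store it

print(len(digest))    # 32 bytes raw
print(len(hex_text))  # 64 characters as hex text, roughly double the storage
```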
Thanks,
Stephen
Greetings,
* Peter J. Holzer (hjp-pgsql@hjp.at) wrote:
On 2019-05-18 19:16:19 -0500, Ron wrote:
If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.
Ouch. Response time on a big table would take a serious hit if that rehash
happened in the middle of the day on a big OLTP system.
As noted below, that isn't actually how it works with PG's hash indexes.
So that might be a reason not to use hash indexes on tables where the worst
case insert time is a concern (the average insert time is still O(1)).
The worst-case insert time is certainly worse than the average but it's
not nearly as bad as being imagined here.
You can split buckets lazily, and in fact PostgreSQL does this. I just
looked at the code briefly, but I don't really understand how it works.
Guess I would have to read Margo Seltzer's paper for that.
Still, even with that optimization (or tradeoff - it may make the lookup
slower) the README notes that splits are expensive.
There's multiple different levels, really, from "best case" where the
new item can just be placed directly on to a page that has free space,
to "slightly worse case" where an overflow page has to be added, to
"even worse case" where a prior page split has to be finished, to
"probably worst case" where a full page split has to happen. There
might even be some pathological cases where a prior page split has to be
finished and then an overflow page has to be added and then another page
split has to be done, or some such. Even so, for equality-based
lookups (where the data set has few or zero duplicates), hash indexes
can work quite well, but it's of course good to understand that inserts
aren't always going to be exactly the same speed every time.
Thanks,
Stephen