UUID or auto-increment
Hi,
for web application is it needed to use UUID or auto-increment?
1- if two user inserts row at the same time, does it work?
2- dose the database give the same id for both users or execute one of them
first? ( I mean ID conflict not happens?)
Thanks.
Both can handle concurrent writes. auto-increment is nothing but serial or sequence cols and they can handle unique concurrent request. That is why sometimes you may have gaps.UUID is not only unique, but is also unique across space. You can have two different databases generate UUID at the same time and it will still be unique. So that will help if you are consolidating different databases into one big data mart and they can all can go to the same table without conflict. With Sequence or Serial that will be a problem.Finally UUID results in write amplication in wal logs. Keep that in mind if your app does lot of writes.
UUID are also random and not correlated with time typically, so with a very
large table when accessing primarily recent data, hitting an index on a big
table will pull random pages into memory instead of primarily the end of
the index.
On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]
Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.
Because UUID is 32 bytes, while SERIAL is 4 bytes?
--
Angular momentum makes the world go 'round.
Greeitngs,
* Ron (ronljohnsonjr@gmail.com) wrote:
On 8/10/20 11:38 AM, Ravi Krishna wrote:
Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.Because UUID is 32 bytes, while SERIAL is 4 bytes?
and because it's random and so will touch a lot more pages when you're
using it...
Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.
Thanks,
Stephen
On 8/10/20 9:51 AM, Ron wrote:
On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]Finally UUID results in write amplication in wal logs. Keep that in
mind if your app does lot of writes.Because UUID is 32 bytes, while SERIAL is 4 bytes?
You mean 32 digits for 128 bits?:
https://www.postgresql.org/docs/12/datatype-uuid.html
And there is BIGSERIAL which is 8 bytes.
--
Angular momentum makes the world go 'round.
--
Adrian Klaver
adrian.klaver@aklaver.com
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145
On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
Greeitngs,
* Ron (ronljohnsonjr@gmail.com) wrote:
On 8/10/20 11:38 AM, Ravi Krishna wrote:
Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.Because UUID is 32 bytes, while SERIAL is 4 bytes?
and because it's random and so will touch a lot more pages when you're
using it...
I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUID’s (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/ <https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/>). As there is still a random component, however, this seems quite unlikely.
Show quoted text
Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.Thanks,
Stephen
Greetings,
* Israel Brewster (ijbrewster@alaska.edu) wrote:
On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Ron (ronljohnsonjr@gmail.com) wrote:On 8/10/20 11:38 AM, Ravi Krishna wrote:
Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.Because UUID is 32 bytes, while SERIAL is 4 bytes?
and because it's random and so will touch a lot more pages when you're
using it...I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUID’s (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/ <https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/>). As there is still a random component, however, this seems quite unlikely.
Sure, that helps, but it's still not great, and they're still much, much
larger than you'd ever need for an identifier inside of a given system,
so best to map it to something reasonable and avoid them as much as
possible.
Thanks,
Stephen
I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address and
timestamp + random)
I wanted to make this point, using sequential UUIDs helped me reduce write
amplification quite a bit with my application, I didn't use V1, instead I
used: https://pgxn.org/dist/sequential_uuids/
Reduces the pain caused by UUIDs a ton IMO.
-Adam
On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).
If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.
According to wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over "around 3400AD”, depending on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or 14-bit “uniqifying" clock sequence to handle cases of overlap, and then the 48bit MAC address (constant, so no rollover there) is appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over every 429 seconds, however the timestamp *as a whole* won’t roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential, not random (since, last I checked, time was sequential).
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145
Show quoted text
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
On 8/10/20 10:53 AM, Stephen Frost wrote:
Greeitngs,
* Ron (ronljohnsonjr@gmail.com) wrote:
On 8/10/20 11:38 AM, Ravi Krishna wrote:
Finally UUID results in write amplication in wal logs.� Keep that in mind
if your app does lot of writes.Because UUID is 32 bytes, while SERIAL is 4 bytes?
and because it's random and so will touch a lot more pages when you're
using it...Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.Thanks,
Stephen
I suspect the increased storage cost is more related to the size of the
record than to the ratio of the data types.
What says two consecutively saved records ought to be stored on the same
page or will likely be sought with the same search criterion. Serial
ids put a time order (loosely) on the data which may be completely
artificial.
On Mon, Aug 10, 2020 at 1:45 PM Israel Brewster <ijbrewster@alaska.edu>
wrote:
On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.According to wikipedia, the time value in a V1 UUID is a 60-bit number,
and will roll over "around 3400AD”, depending on the algorithm used, or
5236AD if the software treats the timestamp as unsigned. This timestamp is
extended by a 13 or 14-bit “uniqifying" clock sequence to handle cases of
overlap, and then the 48bit MAC address (constant, so no rollover there) is
appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over
every 429 seconds, however the timestamp *as a whole* won’t roll over for
quite a while yet, thereby guaranteeing that the UUIDs will be sequential,
not random (since, last I checked, time was sequential).
Except the time portion of a V1 UUID is not written high to low but rather
low then middle then high which means that the time portion is not
expressed in a sequential format and the left 8 chars of a V1 UUID
"rollover" every 429 seconds or so.
For example a V1 UUID right around now looks like
7db3f2ba-db4f-11ea-87d0-0242ac130003
Less than a second later
7db534cc-db4f-11ea-87d0-0242ac130003
So that looks sequential but in roughly 429 seconds it will look like
7db3f2ba-db4f-11ea-87d1-0242ac130003
More importantly in other roughly 300 seconds it would be something like
6ab3f2ba-db4f-11ea-87d2-0242ac130003
Note the move from 87d0 to 87d1 and 87d2 in the middle but the left 8 bytes
"rollover".
That's not quite sequential in terms of indexing.
John