UUID or auto-increment

Started by Ashkar Devover 5 years ago13 messagesgeneral

ashkardev@gmail.com

over 5 years ago

Hi,
for web application is it needed to use UUID or auto-increment?

1- if two user inserts row at the same time, does it work?
2- dose the database give the same id for both users or execute one of them
first? ( I mean ID conflict not happens?)

Thanks.

Ravi Krishna

srkrishna@yahoo.com

over 5 years ago

In reply to: Ashkar Dev (#1)

Re: UUID or auto-increment

Both can handle concurrent writes. auto-increment is nothing but serial or sequence cols and they can handle unique concurrent request. That is why sometimes you may have gaps.UUID is not only unique, but is also unique across space. You can have two different databases generate UUID at the same time and it will still be unique. So that will help if you are consolidating different databases into one big data mart and they can all can go to the same table without conflict. With Sequence or Serial that will be a problem.Finally UUID results in write amplication in wal logs. Keep that in mind if your app does lot of writes.

Michael Lewis

mlewis@entrata.com

over 5 years ago

In reply to: Ravi Krishna (#2)

Re: UUID or auto-increment

UUID are also random and not correlated with time typically, so with a very
large table when accessing primarily recent data, hitting an index on a big
table will pull random pages into memory instead of primarily the end of
the index.

Ron

ronljohnsonjr@gmail.com

over 5 years ago

In reply to: Ravi Krishna (#2)

Re: UUID or auto-increment

On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]

Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

--
Angular momentum makes the world go 'round.

Stephen Frost

sfrost@snowman.net

over 5 years ago

In reply to: Ron (#4)

Re: UUID or auto-increment

Greeitngs,

* Ron (ronljohnsonjr@gmail.com) wrote:

On 8/10/20 11:38 AM, Ravi Krishna wrote:

Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

and because it's random and so will touch a lot more pages when you're
using it...

Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.

Thanks,

Stephen

Adrian Klaver

adrian.klaver@aklaver.com

over 5 years ago

In reply to: Ron (#4)

Re: UUID or auto-increment

On 8/10/20 9:51 AM, Ron wrote:

On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]

Finally UUID results in write amplication in wal logs. Keep that in
mind if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

You mean 32 digits for 128 bits?:

https://www.postgresql.org/docs/12/datatype-uuid.html

And there is BIGSERIAL which is 8 bytes.

--
Angular momentum makes the world go 'round.

--
Adrian Klaver
adrian.klaver@aklaver.com

Israel Brewster

ijbrewster@alaska.edu

over 5 years ago

In reply to: Stephen Frost (#5)

Re: UUID or auto-increment

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145

On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:

Greeitngs,

* Ron (ronljohnsonjr@gmail.com) wrote:

On 8/10/20 11:38 AM, Ravi Krishna wrote:

Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

and because it's random and so will touch a lot more pages when you're
using it...

I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUID’s (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/ <https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/>). As there is still a random component, however, this seems quite unlikely.

Show quoted text

Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.

Thanks,

Stephen

Stephen Frost

sfrost@snowman.net

over 5 years ago

In reply to: Israel Brewster (#7)

Re: UUID or auto-increment

Greetings,

* Israel Brewster (ijbrewster@alaska.edu) wrote:

On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Ron (ronljohnsonjr@gmail.com) wrote:

On 8/10/20 11:38 AM, Ravi Krishna wrote:

Finally UUID results in write amplication in wal logs. Keep that in mind
if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

and because it's random and so will touch a lot more pages when you're
using it...

I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUID’s (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/ <https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/>). As there is still a random component, however, this seems quite unlikely.

Sure, that helps, but it's still not great, and they're still much, much
larger than you'd ever need for an identifier inside of a given system,
so best to map it to something reasonable and avoid them as much as
possible.

Thanks,

Stephen

Adam Brusselback

adambrusselback@gmail.com

over 5 years ago

In reply to: Israel Brewster (#7)

Re: UUID or auto-increment

I would point out, however, that using a V1 UUID rather than a V4 can

help with this as it is sequential, not random (based on MAC address and
timestamp + random)

I wanted to make this point, using sequential UUIDs helped me reduce write
amplification quite a bit with my application, I didn't use V1, instead I
used: https://pgxn.org/dist/sequential_uuids/

Reduces the pain caused by UUIDs a ton IMO.
-Adam

#10

Peter J. Holzer

hjp-pgsql@hjp.at

over 5 years ago

In reply to: Israel Brewster (#7)

Re: UUID or auto-increment

On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:

I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).

If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.

#11

Israel Brewster

ijbrewster@alaska.edu

over 5 years ago

In reply to: Peter J. Holzer (#10)

Re: UUID or auto-increment

On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:

I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).

If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.

According to wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over "around 3400AD”, depending on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or 14-bit “uniqifying" clock sequence to handle cases of overlap, and then the 48bit MAC address (constant, so no rollover there) is appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over every 429 seconds, however the timestamp *as a whole* won’t roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential, not random (since, last I checked, time was sequential).

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145

Show quoted text

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

#12

Rob Sargent

robjsargent@gmail.com

over 5 years ago

In reply to: Stephen Frost (#5)

Re: UUID or auto-increment

On 8/10/20 10:53 AM, Stephen Frost wrote:

Greeitngs,

* Ron (ronljohnsonjr@gmail.com) wrote:

On 8/10/20 11:38 AM, Ravi Krishna wrote:

Finally UUID results in write amplication in wal logs.ï¿½ Keep that in mind
if your app does lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

and because it's random and so will touch a lot more pages when you're
using it...

Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.

Thanks,

Stephen

I suspect the increased storage cost is more related to the size of the
record than to the ratio of the data types.

What says two consecutively saved records ought to be stored on the same
page or will likely be sought with the same search criterion. Serial
ids put a time order (loosely) on the data which may be completely
artificial.

#13

John W Higgins

wishdev@gmail.com

over 5 years ago

In reply to: Israel Brewster (#11)

Re: UUID or auto-increment

On Mon, Aug 10, 2020 at 1:45 PM Israel Brewster <ijbrewster@alaska.edu>
wrote:

On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:

I would point out, however, that using a V1 UUID rather than a V4 can
help with this as it is sequential, not random (based on MAC address
and timestamp + random).

If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.

According to wikipedia, the time value in a V1 UUID is a 60-bit number,
and will roll over "around 3400AD”, depending on the algorithm used, or
5236AD if the software treats the timestamp as unsigned. This timestamp is
extended by a 13 or 14-bit “uniqifying" clock sequence to handle cases of
overlap, and then the 48bit MAC address (constant, so no rollover there) is
appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over
every 429 seconds, however the timestamp *as a whole* won’t roll over for
quite a while yet, thereby guaranteeing that the UUIDs will be sequential,
not random (since, last I checked, time was sequential).

Except the time portion of a V1 UUID is not written high to low but rather
low then middle then high which means that the time portion is not
expressed in a sequential format and the left 8 chars of a V1 UUID
"rollover" every 429 seconds or so.

For example a V1 UUID right around now looks like

7db3f2ba-db4f-11ea-87d0-0242ac130003

Less than a second later

7db534cc-db4f-11ea-87d0-0242ac130003

So that looks sequential but in roughly 429 seconds it will look like

7db3f2ba-db4f-11ea-87d1-0242ac130003

More importantly in other roughly 300 seconds it would be something like

6ab3f2ba-db4f-11ea-87d2-0242ac130003

Note the move from 87d0 to 87d1 and 87d2 in the middle but the left 8 bytes
"rollover".

That's not quite sequential in terms of indexing.

John