Using UUID as a Primary Key

MySQL UUIDs

UUID is a 128-bit number represented by a utf8 string of five hexadecimal numbers separated by dashes. A UUID value looks like:

aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee

There are 5 different versions. MySQL uses UUID version 1.

  • Part 1: aaaaaaaa is generated from timestamp. Also referred to as time-low part
  • Part 2: bbbb is also generated from time-mid part of timestamp
  • Part3: cccc again generated from time-hi part of timestamp. The first c is the UUID version number (1 in this case)
  • Part4: dddd clock sequence
  • Part5: eeeeeeeeeeee Host unique identifier – provides spacial uniqueness. Generally MAC address of the main network interface or a random number if MAC address is not available.

The other commonly used UUID version is 4 – purely random values. The uuidgen tool found in most *NIX based systems, produces UUID version 4 values.

UUID values as Primary Key

Pros:

  • UUID values are globally unique across servers, databases and tables. This is particularly helpful in event driven system architecture.
  • Unique primary keys allow for merging or distribute data from different databases across servers.
  • Not easy to guess when using a web application URL e.g. https://domain.com/view/user/id/20 exposes the fact that users exist in the database that have numerical ids and this info can be exploited

Cons:

  • In the context of MySQL/InnoDB (detailed explanation at UUIDs are Popular, but Bad for Performance — Let’s Discuss)
  • Not efficient inserting purely random data into a B-Tree.
  • Secondary keys store Primary key values at leaf level. This increases the storage requirements with the number of secondary indexes you have.
  • Comparing integers is a lot faster that comparing strings.
  • UUID values (16 bytes) need more storage than integers (4 bytes) / bigint (8 bytes).

MySQL 8 Solution:

From MySQL v8.x, UUID values can be stored in a compact format (BINARY) and displayed in human-readable format (VARCHAR) with the help of following functions:

  • UUID_TO_BIN(string_uuid), UUID_TO_BIN(string_uuidswap_flag)
  • BIN_TO_UUID
  • IS_UUID
  • Use UUID_TO_BIN() with the second parameter set to true/1. This will swap the time-low and time-hi parts of the generated UUID values. This makes the value more useful to the B-Tree arrangement.
  • Use BIN_TO_UUID() to do the reverse. If you produce a binary UUID by calling UUID_TO_BIN() with a second argument of 1 to swap time parts, you should also pass a second argument of 1 to BIN_TO_UUID() to unswap the time parts when converting the binary UUID back to a string UUID:
Menu