This specification describes a scheme for a distributed, uncoordinated Flux Locally Unique ID (FLUID) service that generates 64-bit, k-ordered, unique identifiers composed of a timestamp since some epoch, a generator ID, and a sequence number. The scheme is used to generate Flux job IDs.

======  =========================================
Name    github.com/flux-framework/rfc/spec_19.rst
Editor  Mark Grondona <mgrondona@llnl.gov>
State   raw
======  =========================================

Very low probability of collision: Like 128 bit UUIDs on a global scale, FLUIDs should provide a reasonable guarantee against collisions on a smaller scale: within a Flux instance.
Distributed: Flux job ingest should be distributed to achieve a high job ingest rate, so the generation of IDs should also be capable of being distributed to promote scalability.
Loosely ordered: Job IDs generated by legacy resource managers are typically monotonically increasing integers that reflect job submission order. This property is of debatable utility, but following the principle of least astonishment, FLUIDs should retain it if possible.
The design of FLUIDs is patterned after Twitter Snowflake, and the derived implementation Boundary Flake. The basic scheme is to pack a timestamp, a machine or generator ID, and a sequence number into a fixed number of bits.
Placing the timestamp into the most significant bits of the ID allows independently generated IDs to be loosely sorted (k-ordered). Use of a separate machine ID per generator ensures uniqueness without coordination, and the sequence number ensures each generator can create a certain number of IDs per timestamp unit.
FLUIDs are composed of [ timestamp | id | sequence ] similar to Snowflake, to allow distributed, uncoordinated ID generation across a Flux instance, with the allocation of bits customized for the unique use case in Flux:
- 40 bits for the timestamp since epoch in milliseconds, good for about 34.8 years of runtime with a custom epoch set to job start time.
- 14 bits for the generator ID (up to 16K generators). By default, the generator ID could be set to the rank. For sessions with more than 16K ranks, some ranks could idle their generators and forward requests up the tree, keeping the number of active generators at or below 16K.
- 10 bits for the sequence number (1024 IDs per ms per generator)
With this scheme it is theoretically possible to create a maximum of about 16.8 billion FLUIDs per second (16,384 generators, each producing 1,024 IDs per millisecond) for about 34.8 years.
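As an illustration of the layout above, the following minimal Python sketch packs and unpacks the three fields and reproduces the capacity arithmetic. The function and constant names are hypothetical and are not taken from any Flux implementation; the sample value is the decimal FLUID used as an example elsewhere in this document.

```python
# Bit widths from the layout above: [ timestamp | id | sequence ]
TIMESTAMP_BITS = 40
GENERATOR_BITS = 14
SEQUENCE_BITS = 10


def fluid_pack(timestamp_ms, generator_id, sequence):
    """Combine the three fields into one 64-bit integer (hypothetical helper)."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 40 bits")
    if not 0 <= generator_id < (1 << GENERATOR_BITS):
        raise ValueError("generator ID does not fit in 14 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence number does not fit in 10 bits")
    return (
        (timestamp_ms << (GENERATOR_BITS + SEQUENCE_BITS))
        | (generator_id << SEQUENCE_BITS)
        | sequence
    )


def fluid_unpack(fluid):
    """Split a FLUID into (timestamp_ms, generator_id, sequence)."""
    sequence = fluid & ((1 << SEQUENCE_BITS) - 1)
    generator_id = (fluid >> SEQUENCE_BITS) & ((1 << GENERATOR_BITS) - 1)
    timestamp_ms = fluid >> (GENERATOR_BITS + SEQUENCE_BITS)
    return timestamp_ms, generator_id, sequence


# Capacity figures implied by the bit allocation.
ids_per_second = (1 << GENERATOR_BITS) * (1 << SEQUENCE_BITS) * 1000
lifetime_years = (1 << TIMESTAMP_BITS) / 1000 / (365.25 * 24 * 3600)

if __name__ == "__main__":
    print(fluid_unpack(6731191091817518))
    print(ids_per_second, round(lifetime_years, 1))
```

Under this layout, the sample FLUID ``6731191091817518`` decomposes into a timestamp of 401210253 ms, generator ID 15451, and sequence number 46.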
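Provided that every active generator uses a distinct generator ID and never reuses a (timestamp, sequence) pair, this type of generator guarantees unique IDs, with probability of collision equal to zero, so no collision detection is required.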
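A single generator following this scheme might look like the sketch below. This is only an illustration under stated assumptions: the class name, the injectable ``clock`` parameter, and the handling of clock regressions and sequence exhaustion (reuse the last timestamp, or wait for the next millisecond) are hypothetical choices that this specification does not mandate.

```python
import time


class FluidGenerator:
    """Hypothetical per-generator FLUID source (40/14/10 bit layout)."""

    def __init__(self, generator_id, epoch_ms, clock=None):
        if not 0 <= generator_id < (1 << 14):
            raise ValueError("generator ID does not fit in 14 bits")
        self.generator_id = generator_id
        self.epoch_ms = epoch_ms
        # clock() returns the current time in milliseconds (assumption).
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ts = -1
        self._seq = 0

    def next_id(self):
        ts = self._clock() - self.epoch_ms
        if ts < self._last_ts:
            # Clock moved backwards: keep using the last timestamp so that
            # no (timestamp, sequence) pair is ever issued twice.
            ts = self._last_ts
        if ts == self._last_ts:
            self._seq += 1
            if self._seq >= (1 << 10):
                # 1024 IDs already issued this millisecond: wait for the next.
                while ts <= self._last_ts:
                    ts = self._clock() - self.epoch_ms
                self._seq = 0
        else:
            self._seq = 0
        if ts >= (1 << 40):
            raise OverflowError("timestamp no longer fits in 40 bits")
        self._last_ts = ts
        return (ts << 24) | (self.generator_id << 10) | self._seq
```

Because the timestamp occupies the most significant bits, IDs from one generator are strictly increasing, and IDs from different generators differ at least in the generator ID field.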
A FLUID is a 64-bit integer, e.g. ``6731191091817518``.
Representations other than decimal MAY be used where appropriate, for instance for compactness or ease of transcription over the phone.
The following sections describe the set of supported alternate representations for FLUIDs.
In order to create a compact, human readable representation of a FLUID, the main alternate encoding of a FLUID SHALL be Base58, using the alphabet ``123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz``.
To disambiguate this Base58 representation from decimal or other representations, FLUIDs with this encoding SHALL be prefixed with the Unicode character U+0192 ``ƒ``, and the final result SHALL be termed "FLUID base58 Encoding" or F58 for short. For example, the FLUID ``6731191091817518`` from above SHALL be represented in F58 as ``ƒuZZybuNNy``. As a fallback mechanism, a FLUID prefixed with an ASCII lowercase ``f`` will also be decoded as F58.
Examples: ``ƒZemgA8Bzf``, ``ƒ278oEf7zGf``
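The F58 rules above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names are hypothetical, and encoding the value 0 as the single digit ``1`` is an assumption that follows from plain base-58 conversion rather than from text in this specification.

```python
F58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
F58_PREFIX = "\u0192"  # U+0192, LATIN SMALL LETTER F WITH HOOK


def f58_encode(fluid):
    """Encode a 64-bit FLUID as F58 (prefix followed by Base58 digits)."""
    if not 0 <= fluid < (1 << 64):
        raise ValueError("FLUID must fit in 64 bits")
    digits = []
    while True:
        fluid, rem = divmod(fluid, 58)
        digits.append(F58_ALPHABET[rem])
        if fluid == 0:
            break
    return F58_PREFIX + "".join(reversed(digits))


def f58_decode(text):
    """Decode F58, accepting either the U+0192 prefix or the ASCII 'f' fallback."""
    text = text.strip()
    if not text or text[0] not in (F58_PREFIX, "f"):
        raise ValueError("missing F58 prefix")
    body = text[1:]
    if not body:
        raise ValueError("empty F58 body")
    value = 0
    for ch in body:
        index = F58_ALPHABET.find(ch)
        if index < 0:
            raise ValueError(f"invalid Base58 character: {ch!r}")
        value = value * 58 + index
    if value >= (1 << 64):
        raise ValueError("decoded value does not fit in 64 bits")
    return value


if __name__ == "__main__":
    print(f58_encode(6731191091817518))
    print(f58_decode("fuZZybuNNy"))
```

Because the prefix is always consumed as the first character, the fact that ``f`` is also a Base58 digit does not make decoding ambiguous.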
A hexadecimal encoding SHALL represent a FLUID in base16, including a ``0x`` prefix to unambiguously differentiate the representation from other FLUID standard encodings.
Examples: ``0x17e9fb8df16c2e``, ``0xedaf97d000000``
In order to support indexing of FLUIDs in a hierarchical KVS namespace, a dotted-hexadecimal encoding SHALL represent a FLUID in base16, with each 4 hexadecimal digits separated by dots (``.``).
Examples: ``0017.e9fb.8df1.6c2e``, ``000e.daf9.7d00.0000``
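Both base16 forms are straightforward to produce with standard string formatting. The sketch below is illustrative only; the helper names are hypothetical, and zero-padding the dotted form to exactly 16 digits (four groups of four) is inferred from the examples rather than stated as a rule in this specification.

```python
import string


def fluid_to_hex(fluid):
    """Hexadecimal form with the 0x prefix, no zero padding."""
    return f"0x{fluid:x}"


def fluid_to_dothex(fluid):
    """Dotted-hexadecimal form: 16 zero-padded digits in groups of four."""
    digits = f"{fluid:016x}"
    return ".".join(digits[i:i + 4] for i in range(0, 16, 4))


def dothex_to_fluid(text):
    """Parse the dotted-hexadecimal form back into an integer."""
    groups = text.strip().split(".")
    if len(groups) != 4 or any(len(group) != 4 for group in groups):
        raise ValueError("expected four groups of four hex digits")
    joined = "".join(groups)
    if any(ch not in string.hexdigits for ch in joined):
        raise ValueError("non-hexadecimal character in dothex string")
    return int(joined, 16)


if __name__ == "__main__":
    print(fluid_to_hex(6731191091817518))
    print(fluid_to_dothex(6731191091817518))
```

The two examples in each pair above denote the same FLUIDs: ``0x17e9fb8df16c2e`` corresponds to ``0017.e9fb.8df1.6c2e``, and ``0xedaf97d000000`` to ``000e.daf9.7d00.0000``.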
In order to ease the transfer of FLUIDs via human interaction, a mnemonic representation of FLUIDs SHALL be supported by a conformant implementation.
The mnemonicode implementation converts integers to strings of pronounceable words and back again. This encoding MAY be used when a FLUID must be conveyed by speaking, e.g. over the phone.
Examples: ``reform-remote-galileo--heart-package-academy``, ``random-idea-yoyo--sugar-printer-academy``
In order to encode a FLUID using a minimal number of printable characters, and to increase visual appeal when displaying FLUIDs in various settings, a conformant implementation MAY supply an encoding of FLUIDs to a string of Unicode emoji characters.
The emoji used for encoding SHALL be selected to have the following properties:
- For maximum compatibility, the selected emoji SHALL be taken from the standard Unicode 6.0 (2010) emoji set listed at https://unicode.org/emoji/charts/emoji-versions.html#2010
- The selected emoji SHALL be encoded as 4 bytes in UTF-8
- The selected emoji SHALL start with the common bytes: F0 9F
The implementation SHALL store the emoji in the order specified by the CLDR collation rules as in the above array and as shown here:
https://unicode.org/emoji/charts-12.1/emoji-ordering.txt
The above specifications result in a selection of 576 emoji here:

.. literalinclude:: data/spec_19/basemoji.h
   :language: C
The above table of 576 emoji SHALL be used to encode a FLUID in base 576 with the position of an emoji in the array above denoting its value. That is, the first emoji (U+1F603 😃 grinning face with big eyes) SHALL represent decimal 0, the second (U+1F604 😄 grinning face with smiling eyes) SHALL represent 1, etc.
Examples: ``🚹💂🙌😳💱🏃``, ``😄😹🎇📥🏧🙉🔞``, ``🚹💂🈳💰🎩🏃``
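The base-576 conversion can be sketched as follows. The 576-entry table itself lives in ``basemoji.h`` and is not reproduced here, so the functions take it as a ``table`` argument; all names are hypothetical, and this is an outline of the positional encoding described above rather than the reference implementation.

```python
EMOJI_BASE = 576


def is_candidate_emoji(ch):
    """Check the byte-level selection rules: 4 UTF-8 bytes starting F0 9F."""
    encoded = ch.encode("utf-8")
    return len(ch) == 1 and len(encoded) == 4 and encoded[:2] == b"\xf0\x9f"


def emoji_encode(fluid, table):
    """Encode a FLUID in base 576; table[i] is the emoji with value i."""
    if len(table) != EMOJI_BASE:
        raise ValueError("table must contain exactly 576 emoji")
    if not 0 <= fluid < (1 << 64):
        raise ValueError("FLUID must fit in 64 bits")
    out = []
    while True:
        fluid, rem = divmod(fluid, EMOJI_BASE)
        out.append(table[rem])
        if fluid == 0:
            break
    return "".join(reversed(out))


def emoji_decode(text, table):
    """Decode a base-576 emoji string using the same table."""
    index = {ch: i for i, ch in enumerate(table)}
    value = 0
    for ch in text.strip():
        if ch not in index:
            raise ValueError(f"character not in emoji table: {ch!r}")
        value = value * EMOJI_BASE + index[ch]
    return value
```

Since 576 to the 6th power is below 2 to the 64th and 576 to the 7th is above it, any 64-bit FLUID fits in at most seven emoji.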
The standard FLUID representations described in this RFC are unambiguous by design. That is, the type of FLUID encoding can be definitively determined by the string representation.
Implementations that take an encoded FLUID as a string argument SHALL use the following rules to decode the argument:
- If the string contains ``.``, then decode as "dothex"
- Else if the string contains ``-``, then decode as "words"
- Else if the string starts with ``ƒ`` or ``f``, decode as F58
- Else if the string begins with bytes ``0xf0`` and ``0x9f``, then decode as "emoji"
- Else if the string starts with ``0x``, decode as "hex"
- Otherwise, decode as decimal
An implementation decoding FLUID string representations SHALL ignore leading and trailing whitespace.
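The detection rules, including the whitespace requirement, can be expressed as a short ordered chain of checks. The following Python sketch is illustrative; the function name and the returned labels are hypothetical, and it only classifies the string without decoding it.

```python
def fluid_encoding(text):
    """Classify a FLUID string using the ordered rules above."""
    text = text.strip()  # leading and trailing whitespace is ignored
    if "." in text:
        return "dothex"
    if "-" in text:
        return "words"
    if text.startswith(("\u0192", "f")):  # U+0192 prefix or ASCII fallback
        return "f58"
    if text.encode("utf-8").startswith(b"\xf0\x9f"):
        return "emoji"
    if text.startswith("0x"):
        return "hex"
    return "decimal"


if __name__ == "__main__":
    for sample in (
        "0017.e9fb.8df1.6c2e",
        "random-idea-yoyo--sugar-printer-academy",
        "fuZZybuNNy",
        "0x17e9fb8df16c2e",
        " 6731191091817518\n",
    ):
        print(repr(sample), "->", fluid_encoding(sample))
```

The order of the checks matters: the prefix tests for F58, emoji, and hex are only reached once the dothex and words separators have been ruled out.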