v0.2.0: RFC-9562 Support #53
Comments
Yes, all of the tools to implement these should be in place. I’ll take a look
Thanks,
Dan
On Jul 1, 2024, at 12:19 PM, Chris Hapgood wrote:
This library has worked really well for us generating deterministic (v5) UUIDs. We also would benefit from using the new v7 UUIDs and would prefer to centralize UUID support in clj-uuid. Is there any plan to add support for UUID Version 7 <https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.html>?
|
Waiting for v7 support - I can see the code changes in branch "53-v7-support" from last July - any release date planned? |
This new RFC 9562 section 6.2 has a very cryptic prescription of the v7 implementation, which will yield many different implementations with different "guarantees" about the monotonicity and the associated lexicographical order of the generated uuidv7's - a recipe for many future obscure bugs! |
A few more observations and suggestions:

The counter can run from 0 to 4095 but is initialized to a random start value in that range, to add a little more randomness and to obfuscate correlations between uuid values from the same generator for privacy reasons. When the counter runs out, the generator will spin in a loop-recur till the next millisecond arrives. This should be avoided as much as possible, as that cpu core will show 100% utilization just spinning for that duration.

One option to limit the chance of hitting the loop-recur is to reinitialize the counter to 0 instead of randomly between 0-4095. That way you will always be able to generate the maximum number of uuids within a millisecond and limit the chance of running out the counter. IMHO the loss of privacy and randomness is not substantial ... also the rfc is not very prescriptive about this...

Another option is to sleep for a number of nanoseconds instead of spinning in that loop-recur. You could use a java.time.Instant, which has nanosecond precision (although on my Mac it yields only relevant microseconds), and call Thread/sleep with the number of nanoseconds till the next millisecond arrives. This kind of works in my repl... but I don't want to spend more time investigating unless it becomes more relevant. |
After reading some of the discussions about the other Java-based implementations of v7, I don't think the limitations around this guaranteed lexical monotonicity are clear. As far as I understand, you have to maintain state in order to generate monotonic uuids. When you randomize the 12 bits after the version number, you can have only one uuid per ms; if an implementation generates more than one uuid per ms, it essentially forces the client to check and discard uuids when monotonicity is a hard requirement.

I've tested the previously suggested code changes about using a nanosecond timer to avoid spinning in a loop-recur, and the following seems to work ok (?).

(let [-state- (atom (->State 0 0))]
  (defn monotonic-unix-time-and-counter []
    (let [^State new-state
          (swap! -state-
                 (fn [^State current-state]
                   (loop [time-now (java.time.Instant/now)]
                     (let [time-now-epoch-millis (.toEpochMilli time-now)
                           nanos (.getNano time-now)
                           ;; nanoseconds left until the next millisecond boundary
                           nanos-till-ms (min 999999 (- 1000000 (rem nanos 1000000)))]
                       (if-not (= (.millis current-state) time-now-epoch-millis)
                         ;; new millisecond: restart the counter at 0
                         (->State 0 time-now-epoch-millis)
                         (let [tt (.seqid current-state)]
                           (if (< tt +random-counter-resolution+)
                             (->State (inc tt) time-now-epoch-millis)
                             (do ;; recur when counter is out of runway - sleep until new millisecond
                               (java.lang.Thread/sleep 0 nanos-till-ms)
                               (recur (java.time.Instant/now))))))))))]
      [(.millis new-state) (.seqid new-state)]))) |
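One caveat with the snippet above: the function passed to `swap!` may be called more than once under contention, so a sleep inside it can be repeated. A minimal sketch of one way around that keeps the state transition pure and does the waiting outside the retry loop. All names here (`counter-max`, `gen-state`, `next-state`, `monotonic-time-and-counter`) are hypothetical and not part of clj-uuid, and the limit of 4095 is an assumption standing in for `+random-counter-resolution+`.

```clojure
;; Hypothetical sketch, not clj-uuid's implementation.
(def counter-max 4095) ; assumed largest value of a 12-bit counter

(defonce gen-state (atom {:millis 0 :seqid 0}))

(defn next-state
  "Pure transition. Returns the next state, or nil when the counter
  is exhausted for the stored millisecond."
  [{:keys [millis seqid]} now-ms]
  (cond
    (> now-ms millis)     {:millis now-ms :seqid 0}
    ;; same millisecond, or the clock stepped backwards: keep the
    ;; stored timestamp and advance the counter
    (< seqid counter-max) {:millis millis :seqid (inc seqid)}
    :else                 nil))

(defn monotonic-time-and-counter
  "Returns [millis seqid], strictly increasing across calls."
  []
  (loop []
    (let [old @gen-state
          nxt (next-state old (System/currentTimeMillis))]
      (cond
        ;; counter exhausted: wait outside of any atom update, then retry
        (nil? nxt) (do (Thread/sleep 1) (recur))
        (compare-and-set! gen-state old nxt) [(:millis nxt) (:seqid nxt)]
        ;; another thread won the race: retry with fresh state
        :else (recur)))))
```

Because the update goes through `compare-and-set!`, each successful call observes a state strictly greater than the one it replaced, which is what the monotonicity tests would need; sleeping a full millisecond is coarser than the nanosecond approach above and is chosen here only for simplicity.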
UUID v7 support is anxiously anticipated for use with Datomic! https://clojurians.slack.com/archives/C03RZMDSH/p1725976130910969 |
ok -- sorry I was out of office. Let's see if we can get this done quickly |
@franks42 this is pretty cool. I was also not feeling comfortable about not using all of the available bits and initializing to random number. This makes sense to me -- does anyone disagree? "Monotonicity can only be guaranteed by a single thread UUID generator" seems like a big caveat -- is it actually better to use loop/recur if it provides a monotonicity guarantee? |
My thought is "correctness first". Does everyone agree that the current implementation is correct and continues to provide a monotonicity guarantee? |
The rfc is difficult to parse about what a “correct” implementation of v7 should look like. It feels like they leave the monotonicity guarantee open - up to the implementer, or more correctly, the v7 generator service.

The service level guarantee for a particular v7 generator service could be that monotonicity is only guaranteed for the time component, and that it’s the client’s responsibility to keep track of possible collisions if monotonicity is important for the client’s use case. The partial monotonicity may be good enough for clients for whom a clustering of uuids around the millisecond precision of the uuid’s time component is all they need.

If you require true monotonicity, then the SLA of a dedicated v7 generator could guarantee that for the whole uuid… but the price to pay is an absolute maximum number of uuids that can be generated: from 1 per millisecond up to about 4000 per millisecond… depending on how the 12 bits are used.

Guess any v7 generator should come with an SLA such that the client knows what to expect and whether or not it meets its requirements.

Is your understanding of the rfc “similar”?

Regards, Frank.
|
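For readers following the "12 bits" discussion, here is a minimal sketch of how the v7 fields fit into the two 64-bit halves of a `java.util.UUID`, following the RFC 9562 layout (48-bit millisecond timestamp, 4-bit version, 12-bit `rand_a`, 2-bit variant, 62-bit `rand_b`). The helper name `v7-from-parts` is hypothetical and not part of clj-uuid, and using `rand_a` as a counter is only one of the options the RFC allows.

```clojure
(import 'java.util.UUID)

(defn v7-from-parts
  "Hypothetical helper: assemble a version-7 UUID from a Unix
  timestamp in ms, a 12-bit counter (rand_a) and 62 random bits (rand_b)."
  [unix-ms counter rand-b]
  (let [msb (bit-or (bit-shift-left (bit-and unix-ms 0xFFFFFFFFFFFF) 16) ; 48-bit timestamp
                    0x7000                                               ; version 7
                    (bit-and counter 0xFFF))                             ; 12-bit counter
        lsb (bit-or (bit-shift-left 2 62)                                ; variant bits 10
                    (bit-and rand-b 0x3FFFFFFFFFFFFFFF))]                ; 62 random bits
    (UUID. msb lsb)))

;; Usage: the timestamp and counter occupy the leading hex digits,
;; which is why UUIDs from later milliseconds sort later.
(str (v7-from-parts 0x017F22E279B0 0xCC3 0x18C4DC0C0C07398F))
;; => "017f22e2-79b0-7cc3-98c4-dc0c0c07398f"
```

With this layout, two UUIDs generated in the same millisecond order correctly only if the counter bits differ in the right direction, which is the guarantee being debated in this thread.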
@franks42 and others. yes, reading through chapter 6 again, but, also, yes -- I can locally reproduce the monotonicity failures in the v7 concurrency tests. I have not yet seen v6 fail in that same way. edit -- nvm -- as soon as i said that I hit one in another window. |
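@franks42 i did a criterium benchmark on monotonic-unix-time-and-counter and found that (at least in a single threaded example) it didn't perform as well. would you have expected that to be the case?
(require '[criterium.core :as criterium :refer [bench]])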
(bench (and (repeatedly 100000000 monotonic-unix-time-and-random-counter) nil))
;; Evaluation count : 28688470740 in 60 samples of 478141179 calls.
;; Execution time mean : 0.414950 ns
;; Execution time std-deviation : 0.044658 ns
;; Execution time lower quantile : 0.379115 ns ( 2.5%)
;; Execution time upper quantile : 0.501328 ns (97.5%)
;; Overhead used : 1.640844 ns
(bench (and (repeatedly 100000000 monotonic-unix-time-and-counter) nil))
;; Evaluation count : 17791729260 in 60 samples of 296528821 calls.
;; Execution time mean : 1.794180 ns
;; Execution time std-deviation : 0.074041 ns
;; Execution time lower quantile : 1.730718 ns ( 2.5%)
;; Execution time upper quantile : 1.930633 ns (97.5%)
;; Overhead used : 1.640844 ns |
(Note that I'm traveling and won't be able to make code changes and run
tests for another week...)
Interesting results - seems like my version is about 4-5 times slower than
yours when you run it in a tight loop for 100 million times.
As far as I can see, the differences in the code are:
* use of `java.time.Instant/now` vs `System/currentTimeMillis`
Would be interesting to see a benchmark between those two calls to see if
that could explain (?)
* start counting from 0 vs random value between 0-4095
Doesn't feel that this difference can account for the benchmark results -
probably easy to make your implementation start at 0 for a test (?)
* `java.lang.Thread/sleep` vs loop-recur spinning till the next ms
when you replace the sleep with the recur of your implementation... would
that make a difference?
When you run in such a tight loop without the need for random-number
generation for the rest of the uuid fields (except for setting the initial
counter), then both implementations should easily hit the 4095 threshold
every ms when the average call-time is < 2ns. This would mean that both
implementations will be called 4095 times in a ms and then would either be
sleeping or spinning the rest of that ms in that single last call of that
ms... Intuitively that should make both implementations yield the exact
same number of calls per ms and the benchmark numbers should be identical
(?). (4095 calls per ms or about 250ns per call which is very different
from the benchmark)
What am I missing?
Intriguing ;-)
Regards, Frank.
On Thu, Sep 19, 2024 at 10:58 PM Dan Lentz wrote:
@franks42 i did a criterium benchmark on
monotonic-unix-time-and-counter and found that (at least in a single
threaded example) it didn't perform as well. would you have expected that
to be the case?
|
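One possible explanation for the sub-nanosecond timings, offered as a hedged observation rather than a diagnosis of the actual run: `repeatedly` returns a lazy sequence, and `(and (repeatedly n f) nil)` only checks that the sequence object is truthy, so `f` may never be called inside the benchmarked expression. The sketch below illustrates this with a hypothetical counting function (`calls` and `counted-fn` are illustrative names).

```clojure
(def calls (atom 0))

(defn counted-fn
  "Stand-in for the generator under test; counts its invocations."
  []
  (swap! calls inc))

;; The lazy seq is created but never realized, so counted-fn never runs.
(and (repeatedly 1000 counted-fn) nil)
(def calls-after-lazy @calls)     ; 0

;; dotimes (or dorun over the seq) forces every call.
(dotimes [_ 1000] (counted-fn))
(def calls-after-dotimes @calls)  ; 1000
```

If that is what happened, benchmarking a single call, for example `(bench (monotonic-unix-time-and-counter))`, would measure the generator itself and should show the per-millisecond counter limit discussed above.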
Ok, Just an update. I've improved the handling of out-of-order timestamps and also the rigor of the tests, including multithreaded monotonicity for v1, v6, and v7 UUIDs. In this version, I decreased the random initialization of the v7 subcounter to 8 bits, which should almost double the average number of UUIDs per ms (the spec reads "randomly initialized counter", so I don't see that as necessarily requiring all of the counter bits to be used for that randomness). Even so, the secure random aspects of v7 do have a cost.
user> (bench (uuid/v1))
Evaluation count : 599008800 in 60 samples of 9983480 calls.
Execution time mean : 98.724581 ns
user> (bench (uuid/v6))
Evaluation count : 599782920 in 60 samples of 9996382 calls.
Execution time mean : 98.799153 ns
user> (bench (uuid/v4))
Evaluation count : 231001080 in 60 samples of 3850018 calls.
Execution time mean : 270.384250 ns
user> (bench (uuid/v7))
Evaluation count : 121301880 in 60 samples of 2021698 calls.
Execution time mean : 500.258992 ns
If anyone has time to look through, I'd appreciate any feedback. Otherwise I've got some more work to do on the documentation to get it ready for release. |
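The "almost double" estimate can be checked with a little arithmetic. This is a sketch under stated assumptions: a 12-bit counter with 4096 values, a start value drawn uniformly from the low `init-bits` bits, and a fresh random start each millisecond. The function name `mean-capacity` is hypothetical.

```clojure
(defn mean-capacity
  "Average number of counter values still available in a millisecond
  when a 12-bit counter starts uniformly at random in [0, 2^init-bits)."
  [init-bits]
  (let [total      4096                          ; 2^12 counter values
        starts     (bit-shift-left 1 init-bits)  ; number of possible start values
        mean-start (/ (dec starts) 2)]           ; mean of 0..starts-1
    (- total mean-start)))

(double (mean-capacity 12)) ; => 2048.5
(double (mean-capacity 8))  ; => 3968.5
```

Under these assumptions the ratio is about 1.94, which is consistent with "almost double".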
@cch1 note the updates in |
ok -- documentation has been updated and clj-uuid 0.2.0-SNAPSHOT is now available for use and review on Clojars: https://clojars.org/danlentz/clj-uuid/versions/0.2.0-SNAPSHOT Greatly appreciate all of the input (and encouragement!). I'm planning on merging and releasing 0.2.0 shortly, unless there is any feedback? |
Awesome - thanks for this work, @danlentz. |