-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
no domain error checks #4
Comments
When you say "domain error checks", are you referring to something like erroring on invalid characters in the input? |
Yepp, anything beyond the base64 alphabet chars. |
Interesting, yeah I have to check out Lemire. You're right, I guess my implementation just applies some piece wise linear mapping to the input characters after they've been interpreted as bytes by the built-in JS text encoder, to map them to the 0-63 range |
Which means characters out of the expected range will produce some random garbage |
However, I think I could just do a few more lanewise gt / lt instructions to check the bounds, and then error if any of them are true. Shouldn't add much overhead |
Yeah, you prolly can simply aggregate error conditions in another v128 variable, which only needs to be eval'ed once at the end. Would be enough for a good vs. bad data distinction (unless you want to return the position of the first faulty byte). I dont recommend to introduce branching in the data loop (means early error exit), this is really a perf killer in SIMD realms. |
Hmm I get almost no speed benefit with SIMD + error check compared to a uint32 LUT variant (its ~1.5 GB/s SIMD vs ~1.4 GB/s LUT). Still investigating whether SIMD can be further optimized (and I cannot go w'o error check, as the data comes untested from outside). |
Update: |
Here is my adoption of Mulas ideas with error checking: https://github.com/jerch/xterm-addon-image/blob/de66af18e1b06e0d244ae3632c4d308950c69b15/src/base64.wasm.ts#L102 (Its commented out, since I wont use it until Safari gets wasm-simd sorted out.) |
It seems the SIMD version does no domain error checks while decoding?
Imho you should at least mention that in the readme, so ppl are aware of this limitation and dont run into follow-up bugs by decoded bytes from broken base64 data.
On a sidenote - proper base64 decoding with SIMD is a remarkably tough issue - see the works of Lemire and Mula. Furthermore none of their speedy solutions works with wasm-simd, as they rely on 256 or 512 bit width.
The text was updated successfully, but these errors were encountered: