Value.Parse performance #1151
Replies: 2 comments
-
@denis-ilchishin Hi! Sorry, I meant to respond to this sooner via the discussion thread, but have been somewhat sidetracked getting https://github.com/sinclairzx81/typemap ready for Standard Schema 1.0.
The TypeBox Parse function isn't really designed for high performance. The function was written to be an all-in kitchen-sink pipeline which internally calls Clone, Clean, Default, Convert, Assert and Decode to process a value. Parse pays the operational cost of all of these functions running in sequence.

Parse Configuration

You can optimize the Parse pipeline by configuring it (or even replacing operations with more optimized versions). The following benchmarks a few variations of the default pipeline.

```typescript
import { Value } from '@sinclair/typebox/value'
import { Type } from '@sinclair/typebox'

function benchmark(operations: string[]) {
  const name = operations.length > 0 ? operations.join(', ') : 'Noop'
  console.time(name)
  const type = Type.Object({
    x: Type.Number(),
    y: Type.Number(),
    z: Type.Number()
  })
  const value = { x: 1, y: 2, z: 3 }
  for(let i = 0; i < 100000; i++) Value.Parse(operations, type, value)
  console.timeEnd(name)
}

benchmark(['Clone', 'Clean', 'Default', 'Convert', 'Assert', 'Decode']) // default pipeline
benchmark(['Clean', 'Default', 'Convert', 'Assert', 'Decode'])
benchmark(['Clean', 'Default', 'Convert', 'Assert'])
benchmark(['Clean', 'Default', 'Assert'])
benchmark(['Clean', 'Assert'])
benchmark(['Assert'])
benchmark([])
```

Results

```
Clone, Clean, Default, Convert, Assert, Decode: 176.815ms
Clean, Default, Convert, Assert, Decode: 126.326ms
Clean, Default, Convert, Assert: 87.026ms
Clean, Default, Assert: 70.661ms
Clean, Assert: 34.463ms
Assert: 18.255ms -- very comparable to Zod
Noop: 6.494ms
```

Any potential optimizations would need to be implemented at an operation level, not a Parse level (TypeBox narrows the scope for optimization to each operation by design).
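For example, one of the reduced pipelines benchmarked above can be wrapped as a reusable helper, using the same operations-list overload of Parse shown in the benchmark (the helper name is illustrative):

```typescript
import { Value } from '@sinclair/typebox/value'
import { Type, TSchema, Static } from '@sinclair/typebox'

// a leaner Parse that skips Clone, Convert and Decode
function ParseLean<T extends TSchema>(schema: T, value: unknown): Static<T> {
  return Value.Parse(['Clean', 'Default', 'Assert'], schema, value)
}

const T = Type.Object({ x: Type.Number(), y: Type.Number({ default: 0 }) })
const result = ParseLean(T, { x: 1, extra: true }) // { x: 1, y: 0 }
```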
Parse Extensions

Yeah, this is true. TypeBox Value functions are written as distinct / decoupled operations that require value traversal for each operation run. This is by design, as these functions may be JIT optimized in future (and it's easier to optimize a function that does one thing than a function that does many things). However, I do agree some optimizations would be possible by avoiding repeated traversal per operation. It would be technically possible to implement a single-traversal function in user space, but it would require implementing a new parsing system from scratch. This could be configured in the following way.

```typescript
import { Value, ParseRegistry } from '@sinclair/typebox/value'
import { Type, TSchema, Static } from '@sinclair/typebox'

// ------------------------------------------------------------------
// FastParse
// ------------------------------------------------------------------
ParseRegistry.Set('FastParse', (schema, references, value) => {
  return value // todo: implement all operations as single operation here
})

function FastParse<Type extends TSchema>(schema: Type, value: unknown): Static<Type> {
  return Value.Parse(['FastParse'], schema, value)
}

// ...
const result = FastParse(Type.String(), 'hello')
console.log(result)
```

Answers to Questions
Not using the current Value.* functions; these perform deep traversal by default. You would need to design a new set of functions that only operate at a single level of depth (traversal facilitated by some exterior recursive visitor, not within the functions themselves, which should be limited to mapping Input -> Output). This has partially been explored before, but I would be happy to assist community implementations.
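As a very rough illustration of that shape (not a TypeBox API; ParseNode and the handled cases are hypothetical and cover only object properties and default annotations), a single exterior visitor might look like this:

```typescript
import { Type, TSchema, Kind } from '@sinclair/typebox'
import { Value } from '@sinclair/typebox/value'

// hypothetical: one exterior recursive visitor applying per-node work
// (here just defaults) in a single pass, instead of one traversal per operation
function ParseNode(schema: TSchema, value: unknown): unknown {
  if (value === undefined && 'default' in schema) value = schema.default
  if (schema[Kind] === 'Object' && typeof value === 'object' && value !== null) {
    const properties = (schema as { properties?: Record<string, TSchema> }).properties ?? {}
    for (const key of Object.keys(properties)) {
      const record = value as Record<string, unknown>
      record[key] = ParseNode(properties[key], record[key])
    }
  }
  return value
}

const T = Type.Object({ x: Type.Number({ default: 1 }), y: Type.Number() })
const parsed = ParseNode(T, { y: 2 }) // { y: 2, x: 1 } after a single traversal
Value.Assert(T, parsed)               // final validation still delegated to TypeBox
```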
Caching would be possible external to TypeBox, but not internally. Caching optimizations internal to TypeBox have been attempted before but resulted in too much complexity and nuance to support; as such, caching of schematics should happen exterior to TypeBox (for example, caching via ...).

As for the following, TypeBox can't inject Compiler functionality into the Value.* functions. As mentioned, the Value.* functions may be JIT optimized in future, and I am quite hesitant to introduce coupling between the Compiler and Value.*, as this would complicate future JIT optimization work.

```typescript
// coupling is out of scope.
const checkFn = checks.get(subschema) ?? Check.bind(null, subschema)
```

Hope this brings some insight into things. Happy to discuss optimizations at the Value operation level, but not at a Parse level. The current functions should be fairly optimized, but there is likely room to improve performance on certain types (Union and Intersect especially), so I would be open to discussing better implementations if possible (it would likely require research). Again, sorry for the delay. Should I convert this back to a discussion?
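On the external-caching point above, a user-space cache of compiled checkers keyed by schema might look roughly like this (a sketch only; the cache and CheckCached helper are illustrative and not part of TypeBox):

```typescript
import { TypeCompiler, TypeCheck } from '@sinclair/typebox/compiler'
import { Type, TSchema } from '@sinclair/typebox'

// the cache lives entirely in user space, keyed by schema identity
const checks = new WeakMap<TSchema, TypeCheck<TSchema>>()

function CheckCached(schema: TSchema, value: unknown): boolean {
  let check = checks.get(schema)
  if (check === undefined) {
    check = TypeCompiler.Compile(schema) // one-time compile cost
    checks.set(schema, check)
  }
  return check.Check(value)
}

const T = Type.Object({ x: Type.Number() })
CheckCached(T, { x: 1 }) // compiles on first use
CheckCached(T, { x: 2 }) // reuses the compiled checker
```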
-
Yeah, let's move it back to discussions then. I'm not sure how discussions work and thought you just don't see any notifications or something.
-
Hi, I've been playing around with Typebox and I'm a bit concerned about the performance of Value.Default, Value.Clean, etc.
I've created a simple playground repo to do some benchmarking, and here's a preview:
I'm totally aware that micro-benchmarking is not very representative, and there are a ton of different things that can impact the performance, as well as fundamental differences between how Zod and TypeBox operate. Therefore, these numbers are just for reference and something to base the conversation on.
From my investigation, unions, nested objects and arrays have the biggest impact on performance, which sounds very reasonable. Why that is the case is also pretty obvious: by architectural design, all the actions (apply default values, clean extra properties, etc.) in TypeBox are built to be independently consumable. But this leads to not very satisfactory performance when, for some use case, it's required to pass a value through TypeBox's parse pipeline with multiple actions, most of which deeply traverse the value, therefore doing pretty expensive recursive traversal operations for EACH step. And this is one of the biggest differences from how Zod operates (it does all the validations, default value resolution, extra property cleanup, etc. in one go).
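For reference, the multi-pass shape described above looks roughly like this (the schema and value are just placeholders):

```typescript
import { Value } from '@sinclair/typebox/value'
import { Type } from '@sinclair/typebox'

const T = Type.Object({ x: Type.Number({ default: 0 }), y: Type.Number() })
let value: unknown = { y: '2', extra: true }

// each operation below performs its own recursive traversal of the value
value = Value.Clean(T, value)   // pass 1: drop unknown properties
value = Value.Default(T, value) // pass 2: apply default values
value = Value.Convert(T, value) // pass 3: coerce convertible values ('2' -> 2)
Value.Assert(T, value)          // pass 4: validate
```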
Another thing that I noticed is that, for example, for Value.Default to correctly resolve and apply default values for union types, it's necessary to run Check on the value against each subschema/subtype to figure out which schema to take the default value from. But in this case it always uses runtime validation, which can be significantly slower compared to a compiled validation (see the example in TypeBox's readme).
So I wonder, is there room for improvement here? Do you see any way to optimize it? Here are a couple of things I've been thinking about:
- for every action, make it so that the list of actions to be performed is passed to each type-specific function
- pre-create the check functions of all nested types/schemas, so they can be passed to Value.Default (and others) and be accessed there, just like you do with references, something like this (see the sketch below):

This, probably, should significantly improve performance for union and intersection types.
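A hedged sketch of that second idea follows; note Value.Default does not currently accept such a map, so this only shows how the pre-built checks could be constructed and looked up in user space:

```typescript
import { TypeCompiler } from '@sinclair/typebox/compiler'
import { Type, TSchema } from '@sinclair/typebox'

const A = Type.Object({ kind: Type.Literal('a'), value: Type.Number({ default: 0 }) })
const B = Type.Object({ kind: Type.Literal('b'), value: Type.String({ default: '' }) })
const U = Type.Union([A, B])

// pre-build a check function per union subschema, keyed by schema reference
const checks = new Map<TSchema, (value: unknown) => boolean>()
for (const subschema of U.anyOf) {
  const compiled = TypeCompiler.Compile(subschema)
  checks.set(subschema, value => compiled.Check(value))
}

// a Default-like routine could then look up the precompiled check for each
// subschema instead of re-validating dynamically, e.g.
// const checkFn = checks.get(subschema)
```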