-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved naming of anonymous types and types from files outside of functions.ts
#21
Conversation
functions.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking that two anonymous object types with different field orderings produce the same name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slick. I'm worried that inference.ts is getting very complicated, but I suppose that is by necessity. Maybe at some point we could split out a "library" for the structural/nominal concerns in order to simplify the business logic aspects of schema inference. No holdup on merging this PR.
function cloneTypeDerivationContext(context: TypeDerivationContext): TypeDerivationContext { | ||
return { | ||
objectTypeDefinitions: structuredClone(context.objectTypeDefinitions), | ||
scalarTypeDefinitions: structuredClone(context.scalarTypeDefinitions), | ||
customTypeNameRegistry: context.customTypeNameRegistry.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this cloned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this cloning function is used in places (relaxed types) where we run the type derivation algorithm but throw away the results, so we clone the state and discard it after.
const compareTypeId = (typeIdA: TypeId, typeIdB: TypeId): number => { | ||
if (typeIdA.t === "d" && typeIdB.t == "d") { | ||
return typeIdA.f.localeCompare(typeIdB.f) | ||
|| typeIdA.s - typeIdB.s | ||
|| compareList(compareTypeId, typeIdA.ta, typeIdB.ta); | ||
} else if (typeIdA.t === "i" && typeIdB.t == "i") { | ||
return typeIdA.i.localeCompare(typeIdB.i); | ||
} else if (typeIdA.t === "l-n" && typeIdB.t == "l-n") { | ||
return typeIdA.v - typeIdB.v; | ||
} else if (typeIdA.t === "l-s" && typeIdB.t == "l-s") { | ||
return typeIdA.v.localeCompare(typeIdB.v); | ||
} else if (typeIdA.t === "l-bi" && typeIdB.t == "l-bi") { | ||
const aStr = `${typeIdA.v.negative ? "-" : ""}${typeIdA.v.base10Value}`; | ||
const bStr = `${typeIdB.v.negative ? "-" : ""}${typeIdB.v.base10Value}`; | ||
return aStr.localeCompare(bStr); | ||
} else if (typeIdA.t === "o" && typeIdB.t == "o") { | ||
return compareList( | ||
([nameA, pTypeIdA], [nameB, pTypeIdB]) => nameA.localeCompare(nameB) || compareTypeId(pTypeIdA, pTypeIdB), | ||
typeIdA.p, | ||
typeIdB.p | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these strings could be constants somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, they're encoded into the type as literals; you can't use any other strings. Not really worth making them constants IMO. And they're local to this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is getting pretty gnarly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, might pull it apart in another PR another time. There's still likely to be a biggish file though, I'd keep all the type derivation functions together.
At least it's not 2.5MB like TypeScript! 😂
@sordina Yes, the fields are sorted before they are put into the TypeId, so the original field ordering does not matter. |
Previously the way we named types was messy. We tried to use the name the user had given the type in code, and had a bunch of heuristics we used to try to ensure that duplicate type names wouldn't happen. For example, if the type was imported from a file other than the entrypoint file, we always prepended the file path to the type name. However, this was only done to object types and not to relaxed types, so imported relaxed types could have name conflicts (less of an issue when the entire type is relaxed and totally unvalidated, but still not great).
Given that we know people are probably going to want to split their functions across multiple files and just re-export them from the entrypoint file, we don't want their type names getting butchered with prefixes unless it is actually necessary (ie. there are two types trying to share the same name). To do this, we need to identify all types in use and name them at the end, where we can prefix those names that have multiple types trying to use the name with a disambiguating prefix.
Solution Design
In order to do this, we need to be able to uniquely identify each type, so we can detect when two unique types desire the same name. Turns out this is actually a massive pain in the ass because TypeScript is structurally typed (unlike NDC, which is nominally typed), and so TypeScript really doesn't care about giving types unique IDs. So we had to come up with our own unique identifier for a type. The strategy is as follows:
The type can be one of six different kinds of type (types of different kinds are (obviously) not the same type):
(This algorithm can be found in
inference.ts/makeUniqueTypeIdentifier
)As we encounter types, we register them with their unique ID and the name they'd prefer to have (
inference.ts/CustomTypeNameRegistry.registerUniqueType
). For types with a symbol, that'll be the symbol name, but for anonymous types we'll generate one based on how they're used (function name+param name, object type name+property name, etc).Then, when we're done discovering all types, we determine the final names for all unique types (
inference.ts/CustomTypeNameRegistry.determineFinalTypeNames
), and replace all instances of the unique type name with the final name (inference.ts/applyFinalTypeNamesToFunctionsSchema
).When two (or more) types desire the same preferred name, nobody gets the preferred name (to help prevent that changing arbitrarily as the code is changed), and they all get a prefixed name. The prefix is derived from the path the source file where the type is declared. The path of the source file is truncated to only include the path from the project root. The project root is where the package.json file is (
inference.ts/getProjectRootDirectory
). If somehow two unique types still end up with same name somehow, then this is detected and a number is added to the end to force disambiguation.Anonymous type naming has been changed to produce shorter names (
inference.ts/generateTypeNameFromTypePath
). Instead of being constructed from the full path from the function root to the type, they are now named after where the anonymous type is immediately used. This produces a shorter, more readable preferred type name. The three expected places to find anonymous types right now are:functionName_paramName
)functionName_output
)objectTypePreferredName_propertyName
)Anonymous type naming can result in unexpected type name changes where the same anonymous type (ie same unique id) is used in multiple places. The preferred name allocated will be derived from the first place that anonymous type is encountered. As such, it is recommended that users name all their types using aliases to retain stable type names.
Tests
All existing tests still pass, only a few existing tests have been modified. Namely:
basic-inference.test.ts
- Anonymous types used now reflect the new naming conventionrelaxed-types.test.ts
- An error message gets a more readable error that reflects the type name in TypeScript not the preferred namenaming-conflicts
tests have been moved into the newtype-naming
test suite and renamed toimported-types
.New tests have been added in the
type-naming
test suite.JIRA: NLC-3