Improved naming of anonymous types and types from files outside of `functions.ts` #21

daniel-chambers · 2024-03-15T07:13:06Z

Previously, the way we named types was messy. We tried to use the name the user had given the type in code, and relied on a bunch of heuristics to try to ensure that duplicate type names wouldn't happen. For example, if the type was imported from a file other than the entrypoint file, we always prepended the file path to the type name. However, this was only done for object types and not for relaxed types, so imported relaxed types could have name conflicts (less of an issue when the entire type is relaxed and totally unvalidated, but still not great).

Given that we know people are probably going to want to split their functions across multiple files and just re-export them from the entrypoint file, we don't want their type names getting butchered with prefixes unless it is actually necessary (i.e. there are two types trying to share the same name). To do this, we need to identify all types in use and name them at the end, where we can add a disambiguating prefix to only those names that multiple types are competing for.

Solution Design

In order to do this, we need to be able to uniquely identify each type, so we can detect when two unique types desire the same name. Turns out this is actually a massive pain in the ass because TypeScript is structurally typed (unlike NDC, which is nominally typed), and so TypeScript really doesn't care about giving types unique IDs. So we had to come up with our own unique identifier for a type. The strategy is as follows:

The type can be one of six different kinds of type (types of different kinds are (obviously) not the same type):

- An intrinsic type (i.e. string, number, any, etc.), identified by the name of the intrinsic
- A literal type, either a string literal, number literal, or bigint literal, identified by the literal value itself
- An anonymous object type, identified by the property names and types of the properties
- An anonymous union type, identified by the types that make up the union
- An anonymous intersection type, identified by the types that make up the intersection
- A type with a symbol (i.e. a name), identified by the location in the code where the symbol is declared (filename plus position), plus any type parameters used if the type has them

(This algorithm can be found in inference.ts/makeUniqueTypeIdentifier)
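A minimal sketch of what such an identifier could look like. The tags `d`, `i`, `l-n`, `l-s`, `l-bi` and `o` and their field names are taken from the `compareTypeId` fragment quoted in the review of this PR; the `u` and `x` tags, `objectTypeId` and `typeIdKey` are hypothetical and only here for illustration, not the actual code in `inference.ts`.

```typescript
// Sketch only: "u" and "x" (union / intersection) are invented tags.
type TypeId =
  | { t: "i"; i: string } // intrinsic, identified by its name
  | { t: "l-s"; v: string } // string literal
  | { t: "l-n"; v: number } // number literal
  | { t: "l-bi"; v: { negative: boolean; base10Value: string } } // bigint literal
  | { t: "o"; p: [string, TypeId][] } // anonymous object: [propertyName, propertyType] pairs
  | { t: "u"; m: TypeId[] } // anonymous union (assumed shape)
  | { t: "x"; m: TypeId[] } // anonymous intersection (assumed shape)
  | { t: "d"; f: string; s: number; ta: TypeId[] }; // declared symbol: file, position, type arguments

// Hypothetical helper: builds the id of an anonymous object type.
// Properties are sorted by name so declaration order does not affect identity.
function objectTypeId(props: Record<string, TypeId>): TypeId {
  const p = Object.entries(props).sort(([a], [b]) => a.localeCompare(b));
  return { t: "o", p };
}

// Hypothetical helper: serializes an id so it can be used as a Map key.
// Assumes ids are always built with the same key order, as in this sketch.
function typeIdKey(id: TypeId): string {
  return JSON.stringify(id);
}
```

With this shape, `{ x: string, y: 1 }` and `{ y: 1, x: string }` serialize to the same key, while two symbols declared at different positions in the same file do not.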

As we encounter types, we register them with their unique ID and the name they'd prefer to have (inference.ts/CustomTypeNameRegistry.registerUniqueType). For types with a symbol, that'll be the symbol name, but for anonymous types we'll generate one based on how they're used (function name+param name, object type name+property name, etc).

Then, when we're done discovering all types, we determine the final names for all unique types (inference.ts/CustomTypeNameRegistry.determineFinalTypeNames), and replace all instances of the unique type name with the final name (inference.ts/applyFinalTypeNamesToFunctionsSchema).

When two (or more) types desire the same preferred name, nobody gets the preferred name (to help prevent the winner changing arbitrarily as the code is changed), and they all get a prefixed name. The prefix is derived from the path of the source file where the type is declared, truncated to only include the path from the project root. The project root is where the package.json file is (inference.ts/getProjectRootDirectory). If two unique types somehow still end up with the same name, this is detected and a number is added to the end to force disambiguation.
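The conflict rule can be sketched roughly as follows. This is an illustration under assumptions, not the body of `determineFinalTypeNames`: the `RegisteredType` shape, the `pathPrefix` helper and the exact prefix format (path separators replaced with underscores, extension dropped) are all hypothetical.

```typescript
// Hypothetical shape: one entry per unique type discovered during inference.
interface RegisteredType {
  uniqueId: string; // serialized unique type identifier
  preferredName: string; // symbol name, or a generated name for anonymous types
  relativePath: string; // declaring file, relative to the project root
}

// Assumed prefix format: "lib/models.ts" becomes "lib_models".
function pathPrefix(relativePath: string): string {
  return relativePath.replace(/\.[^./]+$/, "").replace(/[^A-Za-z0-9]+/g, "_");
}

function determineFinalNames(types: RegisteredType[]): Map<string, string> {
  // Count how many unique types want each preferred name.
  const demand = new Map<string, number>();
  for (const t of types) {
    demand.set(t.preferredName, (demand.get(t.preferredName) ?? 0) + 1);
  }

  const finalNames = new Map<string, string>(); // uniqueId -> final name
  const taken = new Set<string>();
  for (const t of types) {
    // A contested name is prefixed for every claimant, so nobody "wins" the bare name.
    const contested = (demand.get(t.preferredName) ?? 0) > 1;
    const base = contested ? `${pathPrefix(t.relativePath)}_${t.preferredName}` : t.preferredName;
    // Last resort: append a number if the name is still taken.
    let name = base;
    for (let n = 2; taken.has(name); n++) name = `${base}${n}`;
    taken.add(name);
    finalNames.set(t.uniqueId, name);
  }
  return finalNames;
}
```

In this sketch, a `User` declared in both `functions.ts` and `lib/models.ts` ends up as `functions_User` and `lib_models_User`, while an uncontested `Order` keeps its name.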

Anonymous type naming has been changed to produce shorter names (inference.ts/generateTypeNameFromTypePath). Instead of being constructed from the full path from the function root to the type, they are now named after where the anonymous type is immediately used. This produces a shorter, more readable preferred type name. The three expected places to find anonymous types right now are:

- Function parameters (the generated name is functionName_paramName)
- Function return type (the generated name is functionName_output)
- Object properties (the generated name is objectTypePreferredName_propertyName)
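As a rough illustration of the three naming rules listed above (the `TypeUsage` type and `preferredAnonymousName` function are hypothetical and are not the signature of `generateTypeNameFromTypePath`):

```typescript
// Hypothetical description of where an anonymous type is immediately used.
type TypeUsage =
  | { kind: "parameter"; functionName: string; paramName: string }
  | { kind: "output"; functionName: string }
  | { kind: "property"; objectTypePreferredName: string; propertyName: string };

// Generates the preferred name from the immediate usage site only,
// rather than from the full path back to the function root.
function preferredAnonymousName(usage: TypeUsage): string {
  switch (usage.kind) {
    case "parameter":
      return `${usage.functionName}_${usage.paramName}`;
    case "output":
      return `${usage.functionName}_output`;
    case "property":
      return `${usage.objectTypePreferredName}_${usage.propertyName}`;
  }
}
```

For example, an anonymous object in the `address` property of a named `Customer` type would get the preferred name `Customer_address` in this sketch, regardless of which function `Customer` was reached from.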

Anonymous type naming can result in unexpected type name changes where the same anonymous type (i.e. the same unique ID) is used in multiple places. The preferred name allocated will be derived from the first place that anonymous type is encountered. As such, it is recommended that users name all their types using aliases to retain stable type names.
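To make the "first place encountered" behaviour concrete, here is a small sketch. `NameRegistry` is a hypothetical stand-in for `CustomTypeNameRegistry`, and the string ID used in the example is made up.

```typescript
// Hypothetical stand-in for the registry: maps a unique type id to its preferred name.
class NameRegistry {
  private preferred = new Map<string, string>();

  // Returns the preferred name recorded for this unique type.
  // The first registration wins; later registrations of the same id are ignored.
  register(uniqueId: string, preferredName: string): string {
    const existing = this.preferred.get(uniqueId);
    if (existing !== undefined) return existing;
    this.preferred.set(uniqueId, preferredName);
    return preferredName;
  }
}

const registry = new NameRegistry();
// The same anonymous type is used as a parameter of two functions.
const first = registry.register("anon-object-1", "createUser_input");
const second = registry.register("anon-object-1", "updateUser_input");
```

Here both `first` and `second` are `createUser_input`. If the functions were reordered so that `updateUser` was visited first, the name would flip, which is why a type alias gives a more stable result.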

Tests

All existing tests still pass; only a few have been modified, namely:

- basic-inference.test.ts - Anonymous types used now reflect the new naming convention
- relaxed-types.test.ts - An error message is now more readable, reflecting the type name in TypeScript rather than the preferred name
- naming-conflicts tests have been moved into the new type-naming test suite and renamed to imported-types

New tests have been added in the type-naming test suite.

JIRA: NLC-3

sordina

Checking that two anonymous object types with different field orderings produce the same name?

sordina

This is slick. I'm worried that inference.ts is getting very complicated, but I suppose that is by necessity. Maybe at some point we could split out a "library" for the structural/nominal concerns in order to simplify the business logic aspects of schema inference. No holdup on merging this PR.

CHANGELOG.md

sordina · 2024-03-17T23:16:45Z

ndc-lambda-sdk/src/inference.ts

 function cloneTypeDerivationContext(context: TypeDerivationContext): TypeDerivationContext {
  return {
    objectTypeDefinitions: structuredClone(context.objectTypeDefinitions),
    scalarTypeDefinitions: structuredClone(context.scalarTypeDefinitions),
+    customTypeNameRegistry: context.customTypeNameRegistry.clone(),


Why is this cloned?

Because this cloning function is used in places (relaxed types) where we run the type derivation algorithm but throw away the results, so we clone the state and discard it after.
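A minimal sketch of that clone-and-discard pattern, with hypothetical names throughout (`Registry`, `Context`, `speculativelyDerive`). One likely reason the registry gets its own `clone()` method rather than going through `structuredClone` like the other fields is that `structuredClone` does not preserve the prototype of a class instance, so the copy would lose its methods; that reasoning is an assumption, not something stated in the PR.

```typescript
// Hypothetical registry with an explicit clone, since copying a class instance
// generically would not keep its methods.
class Registry {
  private names: Map<string, string>;
  constructor(names?: Map<string, string>) {
    this.names = names ?? new Map<string, string>();
  }
  register(id: string, name: string): void {
    this.names.set(id, name);
  }
  has(id: string): boolean {
    return this.names.has(id);
  }
  clone(): Registry {
    return new Registry(new Map(this.names));
  }
}

interface Context {
  objectTypeDefinitions: Record<string, unknown>;
  registry: Registry;
}

// Shallow copy of the definitions is enough for this sketch; the diff above uses structuredClone.
function cloneContext(context: Context): Context {
  return {
    objectTypeDefinitions: { ...context.objectTypeDefinitions },
    registry: context.registry.clone(),
  };
}

// Runs a derivation against a scratch copy and keeps only its verdict.
function speculativelyDerive(context: Context, derive: (scratch: Context) => boolean): boolean {
  const scratch = cloneContext(context);
  return derive(scratch); // scratch, and anything registered on it, is discarded
}
```

Anything the speculative run registers lands on the scratch copy only, so the real context never sees names for types that were derived and then thrown away.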

ndc-lambda-sdk/src/inference.ts

sordina · 2024-03-17T23:32:54Z

ndc-lambda-sdk/src/inference.ts

+  const compareTypeId = (typeIdA: TypeId, typeIdB: TypeId): number => {
+    if (typeIdA.t === "d" && typeIdB.t === "d") {
+      return typeIdA.f.localeCompare(typeIdB.f)
+        || typeIdA.s - typeIdB.s
+        || compareList(compareTypeId, typeIdA.ta, typeIdB.ta);
+    } else if (typeIdA.t === "i" && typeIdB.t === "i") {
+      return typeIdA.i.localeCompare(typeIdB.i);
+    } else if (typeIdA.t === "l-n" && typeIdB.t === "l-n") {
+      return typeIdA.v - typeIdB.v;
+    } else if (typeIdA.t === "l-s" && typeIdB.t === "l-s") {
+      return typeIdA.v.localeCompare(typeIdB.v);
+    } else if (typeIdA.t === "l-bi" && typeIdB.t === "l-bi") {
+      const aStr = `${typeIdA.v.negative ? "-" : ""}${typeIdA.v.base10Value}`;
+      const bStr = `${typeIdB.v.negative ? "-" : ""}${typeIdB.v.base10Value}`;
+      return aStr.localeCompare(bStr);
+    } else if (typeIdA.t === "o" && typeIdB.t === "o") {
+      return compareList(
+        ([nameA, pTypeIdA], [nameB, pTypeIdB]) => nameA.localeCompare(nameB) || compareTypeId(pTypeIdA, pTypeIdB),
+        typeIdA.p,
+        typeIdB.p
+      );


I guess these strings could be constants somewhere?

Eh, they're encoded into the type as literals; you can't use any other strings. Not really worth making them constants IMO. And they're local to this function.
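The fragment above calls a `compareList` helper that isn't shown in this thread. A plausible sketch, assuming it is an element-wise (lexicographic) comparison that falls back to length; the actual implementation in `inference.ts` may differ:

```typescript
// Assumed behaviour: compare element by element with the supplied comparator,
// return the first non-zero result, and treat the shorter list as smaller on a tie.
function compareList<T>(compare: (a: T, b: T) => number, listA: T[], listB: T[]): number {
  const shared = Math.min(listA.length, listB.length);
  for (let i = 0; i < shared; i++) {
    const result = compare(listA[i], listB[i]);
    if (result !== 0) return result;
  }
  return listA.length - listB.length;
}
```

Because the comparator is passed in, the same helper works both for lists of type IDs (passing `compareTypeId` itself) and for lists of property name and type pairs, which matches how it is called in the fragment.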

sordina · 2024-03-17T23:36:20Z

ndc-lambda-sdk/src/inference.ts

This file is getting pretty gnarly.

Yeah, might pull it apart in another PR another time. There's still likely to be a biggish file though, I'd keep all the type derivation functions together.

At least it's not 2.5MB like TypeScript! 😂

ndc-lambda-sdk/test/inference/relaxed-types/relaxed-types.test.ts

daniel-chambers · 2024-03-18T00:01:03Z

Checking that two anonymous object types with different field orderings produce the same name?

@sordina Yes, the fields are sorted before they are put into the TypeId, so the original field ordering does not matter.

First cut

0a9819c

daniel-chambers self-assigned this Mar 15, 2024

Updated readme

62c799b

daniel-chambers changed the title ~~Rework naming of types~~ Improved naming of anonymous types and types from files outside of functions.ts Mar 15, 2024

sordina reviewed Mar 17, 2024

sordina approved these changes Mar 17, 2024

daniel-chambers merged commit 2ab3967 into main Mar 18, 2024
6 checks passed

daniel-chambers deleted the type-naming branch March 18, 2024 00:30
