Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward references are not always correctly identified. #23

Closed
strue36 opened this issue Apr 6, 2022 · 17 comments · Fixed by #25
Closed

Forward references are not always correctly identified. #23

strue36 opened this issue Apr 6, 2022 · 17 comments · Fixed by #25

Comments

@strue36
Copy link
Contributor

strue36 commented Apr 6, 2022

I came across an issue with a generated schema that was missing a call to update_forward_refs()

It has an InputType that references another InputType that is later in the same file. The InputsPlugin looks like it tries to check for this by marking any InputType that references another InputType as self_referential.

My schema contains a nested non nullable InputType. This is is of type GraphQLNonNull and we don't check the value.type.of_type which shows that this is actually wrapping a GraphQLInputObjectType.

The reference is correctly written as a string. I'm not 100% sure where the code that determines which type hints should be strings lives.

When looking into this, I also noticed that there is a (presumably) related issue in the test_arkitekt_operations test.

This generates the following class at line 305

class Node(BaseModel):
    typename: Optional[Literal["Node"]] = Field(alias="__typename")
    name: str
    interface: str
    package: str
    description: str
    type: NodeType
    id: str
    args: Optional[List[Optional[ArgPort]]]
    kwargs: Optional[List[Optional[KwargPort]]]
    returns: Optional[List[Optional[ReturnPort]]]

But ArgPort is not defined until line 544

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 6, 2022

Ran into the same issue while generating the schema types. I think this should be easily fixed by refactoring the "self_referential" (which are actually forward references ) checks to be happening in the annotation recursion and keeping track of reference of not yet defined types so that they will have forward references. Also I need to refactor and document that part a lot better :D

This btw would be something that would be nice to catch while testing. I was thinking of having tests that actually exec the generated code and would run some simple generation of models to check pydantic can resolve accurately. But I have encountered some NameErrors when trying exec ( i think because of global namespace polution)... Will try and get something to work asap. Any suggestions on strategies? :)

@strue36
Copy link
Contributor Author

strue36 commented Apr 8, 2022

I've had a bit of a play with the testing stuff. I can validate that the generated code is valid by writing to a temporary file and running something like

subprocess.run([sys.executable, file_name])

This still doesn't catch the issue I described because it doesn't try to build each model.

For that I was thinking that we could write a test/series of tests that write the generated files to disk, and for each BaseModel in the file try and build them. (Maybe using hypothesis to automate this)

Do you think we should include the .py files generated from each schema in git?

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 8, 2022

@strue36 I would love hypothesis like testing. It is something that i have been meaning to test out for quite some time now. If you have some tests I would love to add them. Also great that subprocess is working. :)

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 8, 2022

On the topic of forward references:

  • I am working on a refactor for the class registry where it would keep track of not yet definied classes when they are being referenced and add them to a list of necessary forward references when generating them at the end.

This however would break another idea i had of using a process pool executor for running the plugins in parallel. Parellelization could then only work between projects. I have not stress tested turms with massive schemas, so not really sure if this would be necessary...

Whats your take on that?
(Also BTW: I have been putting my take on using streams as an issue in rath if you would like to give some input on that?) Also happy to have a discussion via zoom or chat ;)

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 14, 2022

Accidentally closed this

@jhnnsrs jhnnsrs reopened this Apr 14, 2022
@j-riebe
Copy link
Contributor

j-riebe commented Apr 14, 2022

It has an InputType that references another InputType that is later in the same file. The InputsPlugin looks like it tries to check for this by marking any InputType that references another InputType as self_referential.

There was a similar issue in the ObjectsPlugin.
The problem was, that not all wrapping types have been unraveled while checking for forward references.
It mas missing the while is_wrapping_type() part and performed a hard coded check against NonNullable only.

for value_key, value in object_type.fields.items():
type_ = value.type
while is_wrapping_type(type_):
type_ = type_.of_type
if isinstance(type_, GraphQLObjectType):
self_referential.add(
registry.generate_objecttype_classname(type_.name)
)
elif isinstance(type_, GraphQLInterfaceType):
self_referential.add(
f"{registry.generate_interface_classname(type_.name)}Base"
)

EDIT

Same goes for the InputsPlugin.
@jhnnsrs as this seems like duplicated code, it might be worth to refactor this into a shared function(ality).

if isinstance(value.type, GraphQLNonNull):
if isinstance(value.type.of_type, GraphQLInputObjectType):
self_referential.add(
registry.generate_inputtype_classname(value.type.of_type.name)
)
if isinstance(value.type, GraphQLInputObjectType):
self_referential.add(
registry.generate_inputtype_classname(value.type.name)
)

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 14, 2022

@j-riebe Agreed there was a lot of duplicated code there, I discovered however that i might a quite dramatic conceptual error here, the forward reference update needs to be happening on the parent not on the invoked child, i am currently refactoring this part of the mess :D will keep you posted!

@j-riebe
Copy link
Contributor

j-riebe commented Apr 14, 2022

Fortunately, this error didn't hit me yet 😅

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 14, 2022

Its this kind of error where you wonder how things ever worked... 😄
Anyhow the functionally of the references now lives within the class registry, that keeps
track of what has been generated before and will add class that references this new field, and
at the very end of the code will generate the forward references.

Works well on linux and windows now, but funnily macos gives me the weirdest error right now, that
needs some troubleshooting, it adds references to classes that have been never defined, will need some
figuring out.

Also there is one breaking change given a test you wrote:

type Foo {
    forward: Ref
}

interface Ref {
  id: ID
}

Will yield non working code. The big querstio: is that actually something that we should consider? As a interface is an abstract type anyways and needs to resolve to a type (correct me if i am wrong). So i am wondering if we should return an error when such a thing happens or actually generate the code?

in the scenario that:

type Foo {
    forward: Ref
}

interface Ref {
  id: ID
}

type X implements Ref {
  x: ID
}

Everything works as expected...

@j-riebe @strue36 what are your thoughs?

@j-riebe
Copy link
Contributor

j-riebe commented Apr 14, 2022

In terms of GraphQL-SDL it should be perfectly fine, to use interfaces without implementations.
They won't provide you any value, but that's a different story.
I could image, that subgraphs in a federated schema might lead to such cases.

What exactly won't be working in the test case?
The exec statement I used, told nothing about syntax or MRO errors like before 😁
This is the result on my machine (you see there is still the error with the wrong forward ref).

from pydantic import BaseModel
from typing import Optional

class RefBase(BaseModel):
    id: Optional[str]

class Foo(BaseModel):
    forward: Optional['Ref']
RefBase.update_forward_refs()

Another question:

What is the Union of the implementations about?
From a python POV, it should be totally fine to use RefBase instead of Union[X] in as a type annotation (which IMHO is what counts).

Why don't we get rid of the Unions?
If you wanted to prevent the interfaces to be instantiated directly, consider adding abc.ABC to the base classes.

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 14, 2022

If 'stand alone' interfaces are generally useful the non-union approach should definitely be possible.

It makes me wonder however what actually is are the purposes of generating a pydantic schema generation. What is your use case?

My usecase for code generation is so focused on fragments and operations on the client side, that I am a bit oblivious to what schema generation would be used for.

@j-riebe
Copy link
Contributor

j-riebe commented Apr 15, 2022

Well, my use-case is server-side static typing.

I'm trying to use the pydantic schema together with another code-generation approach for creating ariadne bindables (python's schema first GQL library).
This would allow generating a completely-typed GraphQL-server that only requires wiring via Business-Logic (makes it impossible to forget a resolver).

The pydantic schema is extremly useful for static-typing.
It enables the detection of changes in terms of resolver input and output arguments, e.g. via mypy.

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 15, 2022

Ah ariadne :) This was actually something that I thought would make a great plugin and started working on it myself. I was also looking at generating sqlalchemy types from graphql (+ some schema directives). Would you like to join forces on that?

Another use case mentioned in some private requests to me was using the pydantic types to validate their response on the server side and serialize to json then. Something that I think is probably an antipattern, as pydantic validation is unnecessarily long. I think it would be great to have a list of common usecases and antipatterns in the documentation!

@j-riebe
Copy link
Contributor

j-riebe commented Apr 15, 2022

If you speak about sqlalchemy, have a look at SQLModel.
It's sqlalchemy based on pydantic (made by tiangolo from FastAPI) and may work Out-of-the-box already if you choose it as the base type.
Same goes for beanie if MongoDB is your DB of choice.

Speaking about anti-patterns it might be unwise to generate your DB Models based on the Graph, as this couples your API Design with DB-concerns.

@j-riebe
Copy link
Contributor

j-riebe commented Apr 15, 2022

Regarding the ariadne generator, I already thought about rebuilding it as a turms plugin.
Unfortunately, I can't promise you that I find time for this soon.

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 15, 2022

In terms of GraphQL-SDL it should be perfectly fine, to use interfaces without implementations. They won't provide you any value, but that's a different story. I could image, that subgraphs in a federated schema might lead to such cases.

What exactly won't be working in the test case? The exec statement I used, told nothing about syntax or MRO errors like before 😁 This is the result on my machine (you see there is still the error with the wrong forward ref).

from pydantic import BaseModel
from typing import Optional

class RefBase(BaseModel):
    id: Optional[str]

class Foo(BaseModel):
    forward: Optional['Ref']
RefBase.update_forward_refs()

Another question:

What is the Union of the implementations about? From a python POV, it should be totally fine to use RefBase instead of Union[X] in as a type annotation (which IMHO is what counts).

Why don't we get rid of the Unions? If you wanted to prevent the interfaces to be instantiated directly, consider adding abc.ABC to the base classes.

#26 should have solved the issue and provide both solutions ( "always_resolve_interfaces" would be a strict mode, where this would raise a Generation Error)

@jhnnsrs
Copy link
Owner

jhnnsrs commented Apr 15, 2022

move to #28

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants