Definitions explanation is not clear or understandable (and other suggestions) #25717
Comments
Additionally, maybe there could be a link out to a page that discusses the use of projects vs. deployments. I like how the Next.js and React documentation links out to separate sections that discuss the tradeoffs of one choice versus another. For example, a discussion of when to use additional projects vs. additional deployments:
[Diagram: company infrastructure split into a Business Critical Deployment and a Non-Critical Deployment]
When teams need complete isolation or different access patterns.
Using a single deployment has the following benefits:
And then provide more information on workspaces that use definitions.py files instead of __init__.py: You need to explicitly tell Dagster where to find your definitions through the workspace.yaml file:
load_from:
  - python_file:
      relative_path: marketing/definitions.py
      location_name: marketing_tools
  - python_file:
      relative_path: finance/definitions.py
      location_name: finance_tools
I didn't quite understand the use of the unpacking operator notation in the definition example: The asterisk * in Python is the "unpacking operator".
# Let's say trip_assets contains these assets:
trip_assets = [taxi_trips, taxi_zones, taxi_trips_file]
# And metric_assets contains:
metric_assets = [revenue_by_day, trips_by_day]
# When you use * it "unpacks" the lists:
defs = Definitions(
    assets=[*trip_assets, *metric_assets]
)
# This is equivalent to writing:
defs = Definitions(
    assets=[
        taxi_trips,
        taxi_zones,
        taxi_trips_file,
        revenue_by_day,
        trips_by_day,
    ]
)
Without the *, you'd get nested lists:
# Without unpacking (WRONG):
defs = Definitions(
    assets=[trip_assets, metric_assets]
)
# This would be like:
assets=[[taxi_trips, taxi_zones, taxi_trips_file], [revenue_by_day, trips_by_day]]  # Nested lists!
# With unpacking (CORRECT):
defs = Definitions(
    assets=[*trip_assets, *metric_assets]
)
# This correctly flattens to:
assets=[taxi_trips, taxi_zones, taxi_trips_file, revenue_by_day, trips_by_day]  # Flat list!
You'll often see this pattern when you want to combine multiple lists into a single flat list. It's like saying "take everything out of these lists and put them all together in one new list."
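The flattening behaviour described above can be checked in plain Python without Dagster installed. This is a minimal sketch: the strings are stand-ins for the real asset objects, and the variable names `flat` and `nested` are only illustrative.

```python
# Plain strings stand in for the Dagster asset objects named in the comment above.
trip_assets = ["taxi_trips", "taxi_zones", "taxi_trips_file"]
metric_assets = ["revenue_by_day", "trips_by_day"]

# With unpacking: every element of both lists lands in one flat list.
flat = [*trip_assets, *metric_assets]

# Without unpacking: the outer list holds the two original lists as its only elements.
nested = [trip_assets, metric_assets]

print(len(flat))    # 5
print(len(nested))  # 2

# For lists, unpacking into a new list gives the same result as concatenation.
print(flat == trip_assets + metric_assets)  # True
```

Concatenating with `+` would work equally well for two lists; the `*` form also accepts any other iterable, such as a tuple or a generator.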
I wish the explanation on os.getenv and EnvVar was a little bit clearer: With os.getenv:
With EnvVar:
It's especially useful for:
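One way to picture the difference, as far as I understand it, is eager versus deferred lookup: `os.getenv` reads the variable at the moment the line runs, while Dagster's `EnvVar` holds on to the variable name and resolves it later. The sketch below uses only the standard library. `LazyEnvVar` and `DEMO_DB_PASSWORD` are hypothetical names made up for illustration, and the class is a simplified stand-in, not Dagster's implementation.

```python
import os


class LazyEnvVar:
    """Hypothetical stand-in for a deferred lookup: stores only the variable name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def get_value(self):
        # The environment is read here, when the value is actually needed.
        return os.environ.get(self.name)


# Start from a clean state so the example is deterministic.
os.environ.pop("DEMO_DB_PASSWORD", None)

eager = os.getenv("DEMO_DB_PASSWORD")  # read immediately: the variable is not set yet
lazy = LazyEnvVar("DEMO_DB_PASSWORD")  # nothing is read yet

# The variable is set only after both objects above were created.
os.environ["DEMO_DB_PASSWORD"] = "s3cret"

print(eager)             # None
print(lazy.get_value())  # s3cret
```

The eager value was fixed when the code loaded, so it misses the later change; the deferred lookup sees whatever the environment holds when it is finally asked.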
You seem to have understood Python's unpacking operator quite well, and you've explained how it works correctly. (I'm just a rando, not from the Dagster team.)
That was my proposal for the documentation, in a callout or side link, etc., regarding the asterisk notation in the example.
What's the issue or suggestion?
A Definitions object is a set of Dagster definitions available and loadable by Dagster tools.
This is a circular sentence. If a Definitions object is a set of Dagster definitions that are available, then what are the Dagster definitions, and what makes them available vs. not available? It's totally unclear.
Additionally, the added explanation does not really help explain:
The Definitions object is used to assign definitions to a code location, and each code location can only have a single Definitions object. This object maps to one code location. With code locations, users isolate multiple Dagster projects from each other without requiring multiple deployments. You’ll learn more about code locations a bit later in this lesson.
What are code locations, and why can they have only a single Definitions object? Okay, so the cardinality between Definitions objects and code locations is 1:1, but that doesn't really explain the rest of it.
Additional information
A Definitions object is like a project manifest for Dagster - it bundles together all the assets, jobs, schedules, and other components that make up a single Dagster project. It's like a menu that tells Dagster exactly what's available to run in this specific project. Each separate project (called a code location) needs its own Definitions object, and you can't have multiple Definitions objects in the same location. This setup lets you keep different Dagster projects completely separate from each other, without needing to set up multiple Dagster deployments.
Why do we need this?
Two main reasons:
Each project has its own Definitions, so they don't interfere with each other.
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.