Visualise Pipeline objects #1993
Comments
@astrojuanlu interesting use case. Have you often seen users define pipelines in notebooks or import them there? I thought the vast majority of notebook usage is to do |
I have not, and probably the reason is that traditionally Kedro had taken sort of an anti-notebook stance. We evolved that in 2023, for example by writing https://docs.kedro.org/en/stable/notebooks_and_ipython/notebook-example/add_kedro_to_a_notebook.html I've personally found it very handy to explain things to data scientists with notebooks when teaching. See for example https://github.com/ibis-project/kedro-ibis-tutorial/blob/main/03%20-%20First%20Steps%20with%20Kedro.ipynb, recording (very well received) or https://github.com/astrojuanlu/kedro-databricks-demo/blob/main/First%20Steps%20with%20Kedro%20on%20Databricks.ipynb (essentially the same thing, but with a
We launched a feature earlier this year to do something like that https://docs.kedro.org/en/stable/notebooks_and_ipython/kedro_and_notebooks.html#load-node-line-magic it's for nodes rather than full pipelines though.
That's our impression too yes (and in fact I do that all the time). So this issue would be about taking that one little step further. |
A user just asked about this. |
(And it had nothing to do with notebooks) |
Hello, I'm adding some context for my use case after sending a message on Slack. |
Prior art: #1668 (comment) |
Hi @astrojuanlu , I did an experimental implementation and it seems to be feasible 💯 . I haven't tested complex parts yet, but simple pipelines seem achievable with some limitations. I will do some more testing before documenting the limitations. Thank you |
Fantastic @ravi-kumar-pilla ! So #2241 basically launches a Viz server and then embeds that as an iframe, right? Do you think it's feasible to do this using only the frontend React component, without a server? To reduce overhead and have better control of what's presented. For example, it would be nice if the left toolbar, the node filter area, and the other toolbar weren't even displayed. |
You are right.
Yes, I am thinking about this as well. We can either inject a config header and hide parts of viz or as you said we can totally go with react component. I am exploring on this too. I will update on this. Thank you |
Hi @astrojuanlu , I tried using KedroViz directly in HTML, but we do not bundle KedroViz to be used directly via a CDN link (or I could not find a way to use the package that way). I tried locally and there seem to be some compatibility issues. I reached out to @Huongg and she will have a look at the issue. For now, I tried config on top of starting the server. This would be a first attempt at this feature. We can improve the performance at later stages. If the bundling approach takes time, I would suggest we go with the
Screenshot after configuring only flowchart view:
Thank you |
Thanks a lot @ravi-kumar-pilla !
Indeed, I don't see a bundled version in https://cdn.jsdelivr.net/npm/@quantumblack/kedro-viz/ What would be the cost of doing it?
Could you describe them a bit more? I know the UI would look the same in either case but probably the DX is going to be much better if we avoid the server. A server needs to allocate a port, needs the proper Python dependencies installed, etc. I think we need to continue exploring the feasibility of a JS-only solution. |
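To make the port-allocation overhead concrete, here is a minimal sketch of what any server-based embed has to do before it can start. This is not how Kedro Viz picks its port; `find_free_port` is a hypothetical helper written only for illustration, using the standard library.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host`.

    Hypothetical helper: binding to port 0 lets the OS choose a free port.
    The port is released when the socket closes, so another process could
    still grab it before the server starts.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(f"A server-based embed could listen on port {find_free_port()}")
```

A frontend-only approach would skip this step entirely, which is part of the DX argument above.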
Yesterday we briefly discussed this. @ravi-kumar-pilla clarified that with the current proposal (#2241), even if we do use only the frontend, the user would still need to install Kedro Viz anyway.
Logging my current understanding of the situation:
From https://github.com/kedro-org/kedro-viz-standalone/blob/main/src/App.js, all that's needed is to go from a
However, this is easier said than done. For starters, I couldn't find a schema that defines what properties are expected in that JSON, although they can be derived from other places. The constructor suggests that there are 4 mandatory ones
kedro-viz/src/components/app/app.js, lines 89 to 93 in e418ecd
but actually the response returned by the API has a few more
kedro-viz/package/kedro_viz/api/rest/responses/pipelines.py, lines 203 to 209 in e418ecd
(this is a Pydantic model) This, in turn, is generated here
kedro-viz/package/kedro_viz/api/rest/responses/pipelines.py, lines 212 to 238 in e418ecd
which gets populated here
In other words: the logic to transform a Python pipeline into the expected JSON structure is complex as it stands now. I think this is taking me again to kedro-org/kedro#4363, which has several use cases, possibly including this one. In the meantime, as part of the spike @ravi-kumar-pilla could you keep exploring the bundling issues just in case? And describe what you've found in the meantime. |
Hi @astrojuanlu , Thank you for the comment. You are 💯 correct on -
Regarding the bundling, I have resolved the issue and the bundle can be used directly in HTML. However, the generated HTML works well in the browser but has issues in Jupyter Notebook. I will try fixing the issues today (some issues are around
Once this is done, I will try to see what the bare minimum requirements are to get this working. As you mentioned in the comment, since we generate JSON via Kedro-Viz, we need to install kedro-viz. Instead of the complex viz backend.
On a side note, if we can use
Thank you |
Hi @astrojuanlu , I am able to display KedroViz using the bundle approach inside notebook (pending additional testing), but we can see a demo implementation in the PR. Documenting the current approach and local testing methodology for reference.
Current Approach:
# [TODO: will add options to display certain parts of viz, for now the default is only chart view.
# We can also add more customization if needed]
def visualize(self, pipeline: Pipeline, catalog: DataCatalog = None, embed_in_notebook=True):
Pre-requisites:
# Assuming you have webpack from package.json of kedroViz.
# If not already installed do npm install webpack
npx webpack --mode development
# This will create a viz bundle. The bundle needs to be served for now as it is not published
# Use a local server for publishing. Navigate to the bundle folder `/dist` and run
python -m http.server 8000
# Make sure http://localhost:8000/kedroViz.bundle.js is accessible
jupyter notebook --generate-config
# Go to config file path and add at end of the file
c.ContentsManager.allow_hidden = True
NOTE: The html content is currently saved to a file which is placed under
Testing Current Approach:
# In case of demo_project
cd demo_project
kedro jupyter notebook
# Run each cell present in demo-project/viz_jupyter_test.ipynb of the PR or
# Instantiate your pipeline and execute below code in the jupyter cell
from kedro_viz.launchers.experimental_viz import KedroVizNotebook
KedroVizNotebook().visualize(pipe)
Other approaches:
Questions:
Some questions related to testing and expectations from MVP or first draft:
iv. Since this was a spike, I did not test complex pipelines but I hope the approach works. Can we assume testing the demo_project size pipeline a success for first draft?
Next steps:
Thank you |
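For readers following along, the bundle approach described above can be sketched roughly as follows: generate an HTML page that loads the locally served bundle and carries the pipeline data, then save it to a file the notebook can display. This is only an illustration under stated assumptions, not the code in the PR. The function names, the `__KEDRO_VIZ_DATA__` global, and the sample data are all hypothetical; the bundle URL is the one from the pre-requisites.

```python
import json
from html import escape
from pathlib import Path

# Assumption: the bundle is served locally as described in the pre-requisites.
BUNDLE_URL = "http://localhost:8000/kedroViz.bundle.js"


def build_viz_html(pipeline_json: dict, bundle_url: str = BUNDLE_URL) -> str:
    """Return a standalone HTML page that loads the bundle next to the data.

    How the real bundle consumes the data is not shown in this thread; the
    global `__KEDRO_VIZ_DATA__` is an invented name for illustration only.
    """
    # Escape "</" so the JSON payload cannot close the <script> tag early.
    data = json.dumps(pipeline_json).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        "<html><body>\n"
        '<div id="root"></div>\n'
        f"<script>window.__KEDRO_VIZ_DATA__ = {data};</script>\n"
        f'<script src="{escape(bundle_url, quote=True)}"></script>\n'
        "</body></html>\n"
    )


def write_viz_html(pipeline_json: dict, path: Path) -> Path:
    """Save the generated page so a notebook can embed it from disk."""
    path.write_text(build_viz_html(pipeline_json), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder data; the real JSON structure comes from Kedro-Viz.
    print(write_viz_html({"example": "placeholder"}, Path("viz.html")))
```

In a notebook, the saved file could then be shown with something like `IPython.display.IFrame`, which is presumably why the file location and the hidden-files setting matter in the steps above.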
Thanks a lot for the update @ravi-kumar-pilla !
That's what I had in mind (this or the
Yes! |
In the interest of time-boxing this effort and shipping incremental improvements towards the final goal, for now let's focus on making the Webpack bundle introduced in #2241 part of the normal Kedro Viz release flow. In parallel, we can show the current PoC to users to gather their early feedback. |
Sounds good @astrojuanlu . I will introduce the bundling into the current workflow. |
Bringing part of the discussion on #2256 here: Looks like there are some present challenges with the bundling approach #2256 (comment) @rashidakanchwala commented:
|
Hi @astrojuanlu , I had a discussion with Rashida and we agreed on shipping this feature as experimental using the production bundle. Here is what we will do to ship the feature in the next release -
Let me know what you think of the approach. Thank you |
@ravi-kumar-pilla Let's proceed 👍🏼 |
Originally #1459, extra context in #1833 (comment) reproduced below:
I am showcasing Kedro concepts on a notebook without creating a full-fledged project. Took https://github.com/ibis-project/kedro-ibis-tutorial/blob/main/03%20-%20First%20Steps%20with%20Kedro.ipynb as inspiration, and adapted it to Spark and Databricks (will try to publish that soon).
However, since there is no Kedro Framework project, there is no way I can visualise my pipelines, even though I have a Pipeline object perfectly defined:
It would be insanely awesome if I could do
KedroViz().visualize(pipe).show()
or something like that, without ever needing to set up a Kedro project.