-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Visualize pipeline objects in notebook #2241
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
…at/umd-viz-bundle Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
…at/umd-viz-bundle Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API does fulfill my needs I'd say 👍🏼 shall we wait until #2268 is merged so that we can do proper QA? Would like to try it on JupyterLab/Jupyter Notebook, VS Code notebooks, marimo, and Databricks.
…at/viz-pipe Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
This comment was marked as outdated.
This comment was marked as outdated.
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
@astrojuanlu for security reason (accessing localStorage), esm bundle does not allow globalNavigation to be On databricks: |
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Hi @astrojuanlu , Wow ! Thanks for the quick test and feedback. I can check if we can -
I will fix 1, 2 for now |
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ravi-kumar-pilla ,
Does it make sense to separate how we:
- Load Kedro-Viz from a Kedro project via a FastAPI server
- Load Kedro-Viz in a notebook by generating JSON and bundling it using ESM
Currently, notebook-related functions are in data_loader.py and server.py, making these files larger and somewhat out of place. Would it be better to create a new folder under integrations called notebooks and move the visualizer and loader files there for better separation?
Let me know your thoughts!
package/kedro_viz/__init__.py
Outdated
@@ -3,6 +3,9 @@ | |||
import sys | |||
import warnings | |||
|
|||
# alias to ease Notebook visualization import | |||
from .launchers.notebook_visualizer import NotebookVisualizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably not be there but in intergrations/notebook/init.py then users can do
from kedro_viz.intergrations.notebook import NotebookVisualizer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea. I don't have a strong opinion on this. I wanted users to get the NotebookVisualizer
class easily, but I can move it too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an experimental feature, and we know that only a very small percentage of users run Kedro-Viz in notebooks—especially since run_viz was broken for months! If we import it here, it will load every time a user runs Kedro-Viz, even when notebooks aren’t involved, which isn’t ideal.
else: | ||
notebook_user_pipeline = {"__default__": notebook_user_pipeline} | ||
|
||
return catalog, notebook_user_pipeline, session_store, stats_dict # type: ignore[return-value] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make session_store and stat_dict optional properties in populate_data? We're already removing session_store in the ET removal PR, and stat_dict doesn’t seem essential.
Not sure if we should do the same for catalog—do we actually need it in the notebook visualizer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rashidakanchwala , I think we need DataCatalog to be not optional as we do add_catalog in the following steps and we expect Catalog to be present. For this PR, I am not making the changes to populate_data as it is introducing other changes in DataAccessManager, conftest and other test cases. It would be better we handle this in other ticket which involves moving these methods from server.py as we discussed. wdyt ?
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
from typing import Dict, Optional, Tuple, Union, cast | ||
|
||
from kedro import __version__ | ||
from kedro.framework.session.store import BaseSessionStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need session.store. or will we remove it another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done for type checking, we can remove it when session_store is removed completely after ET removal
def _load_viz_data(self) -> Optional[Any]: | ||
"""Load pipeline and catalog data for visualization.""" | ||
load_and_populate_data_for_notebook_users(self.pipeline, self.catalog) | ||
return get_kedro_project_json_data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we completely sure that load_and_populate_data_for_notebook_users
has finished executing before calling get_kedro_project_json_data()
? Do we need any asynchronous handling here?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
load_and_populate_data_for_notebook_users
does not have any async call to be awaited. Everything is synchronous
html_content = self.generate_html( | ||
json_to_visualize, self.options, self.js_url | ||
) | ||
iframe_content = self._wrap_in_iframe( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to do this after, can we not add heigh/width in generate_html itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the question but the height and width are used to customize the iframe size. Do you think we should customize the root div
instead ? or both iframe and root div
?
Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>
Description
Resolves #1993
NOTE: The bundle URL will be updated once #2268 is merged
Development notes
NotebookVisualizer
and a methodshow
responsible for visualizing Kedro-Viz using the esm bundle in notebookload_data_for_notebook_users
andload_and_populate_data_for_notebook_users
methods to kedro-viz -> integrations -> notebook ->data_loader.py
QA notes
Testing Results :
Jupyter lab:
Databricks:
Marimo:
VS Code
Checklist
RELEASE.md
file