Visualize pipeline objects in notebook #2241

ravi-kumar-pilla · 2025-01-15T19:55:00Z

Description

Resolves #1993

NOTE: The bundle URL will be updated once #2268 is merged

Development notes

Created a NotebookVisualizer class with a show method that renders Kedro-Viz in a notebook using the ESM bundle
Added load_data_for_notebook_users and load_and_populate_data_for_notebook_users to kedro-viz -> integrations -> notebook -> data_loader.py
Added initialize and reset methods to data_access_manager for reuse
Added a few utility functions and classes
Updated the release notes, tests and the GCP load balancer doc link
Removed a few GraphQL checks from the workflow because they were failing

QA notes

All tests should pass
For manual testing, open a Jupyter notebook and try:

from kedro.pipeline import pipeline, node

def dummy(ds1):
    return ds1

n0 = node(dummy, 'flights', 'processed_flights')
dummy_pipe = pipeline([n0])

from kedro_viz.integrations.notebook import NotebookVisualizer
NotebookVisualizer(dummy_pipe).show()

You can also test the demo_project pipelines:

from kedro_viz.integrations.notebook import NotebookVisualizer
from demo_project.pipeline_registry import register_pipelines
demo_pipe = register_pipelines()

# globalNavigation depends on localStorage, so that option does not work here.

NotebookVisualizer(
    pipeline=demo_pipe,
    options={
        "display": {
            "expandPipelinesBtn": False,
            "exportBtn": False,
            "labelBtn": False,
            "layerBtn": False,
            "metadataPanel": True,
            "miniMap": False,
            "sidebar": False,
            "zoomToolbar": False,
        },
        "expandAllPipelines": False,
        "behaviour": {
            "reFocus": False,
        },
        "theme": "dark",
        "width": "100%",
        "height": "600px",
    },
).show()

Testing results (screenshots not preserved): JupyterLab, Databricks, marimo, VS Code

Checklist

Read the contributing guidelines
Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added new entries to the RELEASE.md file
Added tests to cover my changes

Signed-off-by: ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>


astrojuanlu

The API does fulfill my needs, I'd say 👍🏼 Shall we wait until #2268 is merged so that we can do proper QA? I would like to try it on JupyterLab/Jupyter Notebook, VS Code notebooks, marimo, and Databricks.


ravi-kumar-pilla · 2025-02-11T17:06:15Z

@astrojuanlu for security reasons (it accesses localStorage), the ESM bundle does not allow globalNavigation to be True. This is not a blocker for our use case, but if it becomes an issue we can fix it later or go with UMD. I restored globalNavigation and it should work now.

On Databricks: (screenshot not preserved)

astrojuanlu · 2025-02-11T17:33:27Z

The last commit fixed things, it seems.

VS Code

This is how it looks for me, and I have a big screen (1512 x 982). Any chance we could hide the logs? And maybe make the area a bit smaller?

Jupyter Notebook

Same thing

And I'm seeing some warnings and errors in the terminal:

[E 2025-02-11 18:31:21.922 ServerApp] Uncaught exception in write_error
    Traceback (most recent call last):
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/tornado/web.py", line 1788, in _execute
        result = method(*self.path_args, **self.path_kwargs)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/tornado/web.py", line 269, in _unimplemented_method
        raise HTTPError(405)
    tornado.web.HTTPError: HTTP 405: Method Not Allowed
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/extension/handler.py", line 29, in get_template
        template = cast(Template, self.settings[env].get_template(name))  # type:ignore[attr-defined]
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 1016, in get_template
        return self._load_template(name, globals)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 975, in _load_template
        template = self.loader.load(self, name, self.make_globals(globals))
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
        source, filename, uptodate = self.get_source(environment, name)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/loaders.py", line 209, in get_source
        raise TemplateNotFound(
    jinja2.exceptions.TemplateNotFound: '405.html' not found in search path: '/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/notebook/templates'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/base/handlers.py", line 740, in write_error
        html = self.render_template("%s.html" % status_code, **ns)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/extension/handler.py", line 93, in render_template
        template = cast(Template, self.get_template(name))  # type:ignore[attr-defined]
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/extension/handler.py", line 32, in get_template
        return cast(Template, super().get_template(name))  # type:ignore[misc]
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/base/handlers.py", line 662, in get_template
        return self.settings["jinja2_env"].get_template(name)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 1016, in get_template
        return self._load_template(name, globals)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 975, in _load_template
        template = self.loader.load(self, name, self.make_globals(globals))
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
        source, filename, uptodate = self.get_source(environment, name)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/loaders.py", line 209, in get_source
        raise TemplateNotFound(
    jinja2.exceptions.TemplateNotFound: '405.html' not found in search paths: '/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server', '/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/templates'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/tornado/web.py", line 1298, in send_error
        self.write_error(status_code, **kwargs)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/base/handlers.py", line 742, in write_error
        html = self.render_template("error.html", **ns)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jupyter_server/extension/handler.py", line 98, in render_template
        return cast(str, template.render(**ns))
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 1295, in render
        self.environment.handle_exception()
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 942, in handle_exception
        raise rewrite_traceback_stack(source=source)
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/notebook/templates/error.html", line 1, in top-level template code
        <!doctype html><html><head><meta charset="utf-8"><title>{% block title %}{{page_title | e}}{% endblock %}</title>{% block favicon %}<link rel="shortcut icon" type="image/x-icon" href="/static/favicons/favicon.ico">{% endblock %}<script defer="defer" src="{{page_config.fullStaticUrl}}/main.407246dd27aed8010549.js?v=407246dd27aed8010549"></script></head><body class="jp-ThemedContainer">{% block stylesheet %}<style>/* disable initial hide */
      File "/Users/juan_cano/Projects/QuantumBlackLabs/tmp/spaceflights/.venv/lib/python3.10/site-packages/jinja2/environment.py", line 490, in getattr
        return getattr(obj, attribute)
    jinja2.exceptions.UndefinedError: 'page_config' is undefined
[W 2025-02-11 18:31:21.960 JupyterNotebookApp] 405 OPTIONS /notebooks/srcdoc/api/deploy-viz-metadata (@127.0.0.1) 76.06ms referer=None

marimo

This is where it looks best, interestingly enough. Still, maybe the area is too big.

ravi-kumar-pilla · 2025-02-11T17:38:53Z

Hi @astrojuanlu, wow! Thanks for the quick test and feedback. I can check whether we can:

1. Customize the height and width by accepting them as user options (pretty much doable)
2. Hide the logs (I need to check on this)
3. Address the errors on the console: a few seem to come from .venv (I am not sure they are related to the bundle), but there will be some security errors related to localStorage, which seem unavoidable for the moment.

I will fix 1 and 2 for now.
Thank you


.github/workflows/lint.yml

rashidakanchwala

Hi @ravi-kumar-pilla ,

Does it make sense to separate how we:

Load Kedro-Viz from a Kedro project via a FastAPI server
Load Kedro-Viz in a notebook by generating JSON and bundling it using ESM

Currently, notebook-related functions are in data_loader.py and server.py, making these files larger and somewhat out of place. Would it be better to create a new folder under integrations called notebooks and move the visualizer and loader files there for better separation?

Let me know your thoughts!
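
To make the second path concrete, here is a rough sketch of generating a standalone HTML page that embeds the JSON and loads an ES module. It is an assumption-heavy illustration: the `generate_html` signature mirrors the call shown in the visualizer diff, but the body, the `viz-root` element id, the `renderViz` export name and the bundle URL are all placeholders, not the actual Kedro-Viz bundle API.

```python
import json


def generate_html(data: dict, options: dict, js_url: str) -> str:
    """Build an HTML page that passes JSON data to an ES module (sketch only)."""

    def to_js(value) -> str:
        # Escape "</" so embedded strings cannot close the <script> tag early.
        return json.dumps(value).replace("</", "<\\/")

    return f"""<!DOCTYPE html>
<html>
<body>
<div id="viz-root"></div>
<script type="module">
  // `renderViz` is a placeholder export name, not a confirmed bundle API.
  import {{ renderViz }} from {json.dumps(js_url)};
  renderViz(document.getElementById("viz-root"), {to_js(data)}, {to_js(options)});
</script>
</body>
</html>"""


page = generate_html(
    {"nodes": [{"name": "</script>"}]},
    {"theme": "dark"},
    "https://example.invalid/bundle.mjs",  # placeholder URL
)
```

No server is involved in this path: everything the page needs is serialized into the HTML itself.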

rashidakanchwala · 2025-02-12T13:05:51Z

package/kedro_viz/__init__.py

@@ -3,6 +3,9 @@
 import sys
 import warnings

+# alias to ease Notebook visualization import
+from .launchers.notebook_visualizer import NotebookVisualizer


This should probably not be here but in integrations/notebook/__init__.py, so that users can do

from kedro_viz.integrations.notebook import NotebookVisualizer

I like the idea. I don't have a strong opinion on this. I wanted users to get the NotebookVisualizer class easily, but I can move it too.

This is an experimental feature, and we know that only a very small percentage of users run Kedro-Viz in notebooks—especially since run_viz was broken for months! If we import it here, it will load every time a user runs Kedro-Viz, even when notebooks aren’t involved, which isn’t ideal.
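
For reference, the import cost described here can also be avoided while keeping a top-level alias, by resolving the name lazily with a module-level `__getattr__` (PEP 562). The sketch below is only an illustration of that technique and is not what this PR does: `make_lazy_package` is a hypothetical helper, and the standard-library `colorsys` module stands in for the notebook visualizer module. In a real `__init__.py` the `__getattr__` function would be defined directly at module level.

```python
import importlib
import types


def make_lazy_package(name: str, lazy_attrs: dict) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    package = types.ModuleType(name)

    def __getattr__(attr: str):
        # Called only when normal attribute lookup on the module fails (PEP 562).
        if attr not in lazy_attrs:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        module = importlib.import_module(lazy_attrs[attr])
        value = getattr(module, attr)
        setattr(package, attr, value)  # cache so later lookups skip this hook
        return value

    package.__getattr__ = __getattr__
    return package


# Stand-in for a package __init__.py; colorsys plays the role of the
# notebook visualizer module in this illustration.
demo_package = make_lazy_package("demo_package", {"rgb_to_hsv": "colorsys"})
converter = demo_package.rgb_to_hsv  # the target module is resolved here, on first access
```

Moving the class under integrations/notebook, as suggested above, achieves the same effect with less machinery.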

rashidakanchwala · 2025-02-12T14:54:51Z

package/kedro_viz/integrations/kedro/data_loader.py

+    else:
+        notebook_user_pipeline = {"__default__": notebook_user_pipeline}
+
+    return catalog, notebook_user_pipeline, session_store, stats_dict  # type: ignore[return-value]


Can we make session_store and stats_dict optional properties in populate_data? We're already removing session_store in the ET removal PR, and stats_dict doesn’t seem essential.

Not sure if we should do the same for catalog—do we actually need it in the notebook visualizer?

Hi @rashidakanchwala, I think DataCatalog needs to stay required, because we call add_catalog in the following steps and expect the catalog to be present. I am not changing populate_data in this PR, as that would introduce further changes in DataAccessManager, conftest and other test cases. It would be better to handle this in a separate ticket that moves these methods out of server.py, as we discussed. What do you think?
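
As context for the diff excerpt in this thread, the `else` branch wraps a single pipeline under the `__default__` key. A minimal sketch of that normalization follows, assuming the unseen `if` branch handles the case where a dictionary of pipelines is passed; `normalize_pipelines` and `FakePipeline` are hypothetical stand-ins, not names from the PR.

```python
from typing import Any, Dict, Union


class FakePipeline:
    """Stand-in for kedro.pipeline.Pipeline so the sketch runs without Kedro."""


def normalize_pipelines(pipelines: Union[Any, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a name-to-pipeline mapping for either input shape."""
    if isinstance(pipelines, dict):
        # Assumed behaviour: a registry-style dict is used as-is (copied here).
        return dict(pipelines)
    # A single pipeline object is registered as the default pipeline.
    return {"__default__": pipelines}


single = FakePipeline()
as_mapping = normalize_pipelines(single)
registry = normalize_pipelines({"__default__": single, "reporting": FakePipeline()})
```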


rashidakanchwala · 2025-02-13T11:22:38Z

package/kedro_viz/integrations/notebook/data_loader.py

+from typing import Dict, Optional, Tuple, Union, cast
+
+from kedro import __version__
+from kedro.framework.session.store import BaseSessionStore


Do we need session.store, or will we remove it in another PR?

This is used for type checking; we can remove it once session_store is removed completely after the ET removal.

package/kedro_viz/integrations/notebook/data_loader.py

rashidakanchwala · 2025-02-13T11:53:40Z

package/kedro_viz/integrations/notebook/visualizer.py

+    def _load_viz_data(self) -> Optional[Any]:
+        """Load pipeline and catalog data for visualization."""
+        load_and_populate_data_for_notebook_users(self.pipeline, self.catalog)
+        return get_kedro_project_json_data()


Are we completely sure that load_and_populate_data_for_notebook_users has finished executing before calling get_kedro_project_json_data()? Do we need any asynchronous handling here?

load_and_populate_data_for_notebook_users does not make any async call that needs to be awaited. Everything is synchronous.

rashidakanchwala · 2025-02-13T11:55:46Z

package/kedro_viz/integrations/notebook/visualizer.py

+                html_content = self.generate_html(
+                    json_to_visualize, self.options, self.js_url
+                )
+                iframe_content = self._wrap_in_iframe(


Do we need to do this afterwards? Can we not add the height/width in generate_html itself?

I am not sure I understand the question, but the height and width are used to customize the iframe size. Do you think we should customize the root div instead, or both the iframe and the root div?
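
To illustrate the distinction being discussed, sizing the iframe is separate from producing the HTML document inside it. The sketch below assumes an srcdoc-based iframe; `wrap_in_iframe` here is a hypothetical standalone function and not the `_wrap_in_iframe` method from the diff.

```python
from html import escape


def wrap_in_iframe(html_content: str, width: str = "100%", height: str = "600px") -> str:
    """Embed a full HTML document in an iframe via the srcdoc attribute."""
    # Escape quotes and angle brackets so the document survives as an attribute value.
    srcdoc = escape(html_content, quote=True)
    style = escape(f"width: {width}; height: {height}; border: none;", quote=True)
    return f'<iframe srcdoc="{srcdoc}" style="{style}"></iframe>'


iframe = wrap_in_iframe('<div id="root"></div>', width="100%", height="400px")
```

Keeping the size on the iframe means the inner document does not need to know how large the notebook output cell should be.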

All commits below are signed off by ravi_kumar_pilla <ravi_kumar_pilla@mckinsey.com>.

dc31929 initial draft

astrojuanlu mentioned this pull request Jan 16, 2025: Visualise Pipeline objects #1993 (Open)

ravi-kumar-pilla added 28 commits January 21, 2025 18:13

d3448dc adding window config for jupyter users
7755e11 working draft
7c264dd working final draft
bf08766 working final draft
4483342 clean window pollution
fd532ee working draft with 2 approaches
563182f initial bundle draft
e8f7249 update webpack
8b66fec testing webpack
72bcc28 ignore babel for umd
32632d0 testing with published bundle
d3f9c21 tested bundle
c148a55 merge bundle PR
f7e10a1 optimization code added
5031722 Merge branch 'main' of https://github.com/kedro-org/kedro-viz into fe… …at/umd-viz-bundle
06fc82d add optimization to prod bundle
8298408 Merge branch 'main' of https://github.com/kedro-org/kedro-viz into fe… …at/umd-viz-bundle
6e5511b add umd to repo
b49109b v10.3.0
fd379f7 push umd bundle
d962a9b remove additional commits
7ad4be9 remove additional commits
47e2b4b add release note
199a34c merge main
45dd808 add umd bundle
4f69d7e testing esm module
27cfd9d add esm ref
0485f45 add esm

ravi-kumar-pilla added 2 commits February 10, 2025 16:44

1115c34 add granularity to notebook visualizer
05ce8de structured notebook visualizer

ravi-kumar-pilla requested a review from astrojuanlu February 11, 2025 02:12

astrojuanlu reviewed Feb 11, 2025

ravi-kumar-pilla added 3 commits February 11, 2025 08:35

fb73d92 Merge branch 'main' of https://github.com/kedro-org/kedro-viz into fe… …at/viz-pipe
28a228f updated js link
884752c fix lint

This comment was marked as outdated.

67cfa5f restore global navigation

ravi-kumar-pilla added 2 commits February 11, 2025 11:21

bb29abf add default globalNavigation
f956451 fix cache deprecation

14c6c7a fix based on comments

ravi-kumar-pilla requested a review from astrojuanlu February 11, 2025 21:32

rashidakanchwala reviewed Feb 12, 2025 (.github/workflows/lint.yml, resolved)
rashidakanchwala requested changes Feb 12, 2025
rashidakanchwala reviewed Feb 12, 2025

ravi-kumar-pilla added 4 commits February 12, 2025 10:04

8d79192 address PR comments
b494af5 remove unused import
8678052 remove test notebook
84f1e07 fix lint

ravi-kumar-pilla requested a review from rashidakanchwala February 12, 2025 16:28

rashidakanchwala reviewed Feb 13, 2025 (package/kedro_viz/integrations/notebook/data_loader.py, outdated, resolved)
rashidakanchwala reviewed Feb 13, 2025

e7b5239 address PR comments2
