-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CI for agent example within AI cookbook #28
Conversation
Signed-off-by: Sid Murching <sid.murching@databricks.com>
Signed-off-by: Sid Murching <sid.murching@databricks.com>
@@ -42,10 +42,6 @@ | |||
|
|||
# COMMAND ---------- | |||
|
|||
# MAGIC %run ./utils/install_aptget_package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This notebook doesn't exist but also seems unnecessary to run
@@ -1,19 +1,3 @@ | |||
# Databricks notebook source |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't import from notebooks - need to make this a normal python file by deleting the "Databricks notebook source" header here
|
||
|
||
def parse_and_extract( | ||
def _parse_and_extract( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making stuff private if it's not used outside this file
|
||
def load_uc_volume_to_delta_table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This util is hard to test because it does so much, it's easier to break this down into dataframe operations (load files into DF, apply parsing, etc) and test those
@@ -3,11 +3,11 @@ | |||
To start working on this book: | |||
- clone the repo; `cd cookbook` | |||
- use your preferred approach to starting a new python environment | |||
- in that environment, `pip install jupyter-book` | |||
- in that environment, `pip install -r dev/dev_requirements.txt` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding dev requirements needed to run tests, lint, etc
Signed-off-by: Sid Murching <sid.murching@databricks.com>
Signed-off-by: Sid Murching <sid.murching@databricks.com>
@@ -127,7 +121,7 @@ def load_uc_volume_to_delta_table( | |||
display_markdown( | |||
f"### {num_errors} documents had parse errors. Please review.", raw=True | |||
) | |||
display(errors_df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
display() doesn't work in unit tests, asked folks about it in Slack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats unfortunate. if we're not writing tests that actually look at what is displayed, can we just monkeypatch this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Will followup on slack about the util notebook -> util python code change.
return ( | ||
pyspark.sql.SparkSession.builder | ||
.master("local[1]") | ||
# Uncomment the following line for testing on Apple silicon locally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we create some control flag such that we can run pytest
locally (on apple silicon) and get this to work and then CI can have the flag the other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call - actually, looks like CI seems to work with this stuff enabled by default too? So will just remove the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to go in this direction, should we make this change for all the util and validator notebooks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we should - to your point let me also follow up with some links to docs (e.g. https://docs.databricks.com/en/files/workspace-modules.html#work-with-python-and-r-modules) on how to iterate on these modules etc
@@ -127,7 +121,7 @@ def load_uc_volume_to_delta_table( | |||
display_markdown( | |||
f"### {num_errors} documents had parse errors. Please review.", raw=True | |||
) | |||
display(errors_df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats unfortunate. if we're not writing tests that actually look at what is displayed, can we just monkeypatch this out?
What changes are proposed in this pull request?
Adds GitHub actions with CI (which runs unit tests) for the AI cookbook, plus a few example unit tests
How is this PR tested?
I'll send follow-up PRs after this to apply lint to cookbook code and also add more unit tests to the
file_loading
module touched in this PR