Describe the bug
When you run the 'edr report' command from a notebook that has elementary installed as a cluster library (so it is installed on start up and persisted across sessions), the report generation will fail on a permission error when trying to run 'dbt deps' if the cluster is in 'shared' access mode. If the cluster is in 'single user' access mode the command will succeed.
To Reproduce
Create an all purpose compute cluster with access mode 'shared'
Install "elementary-data==0.15.1" from PyPI on it
Connect to a GitHub repo that contains a dbt project or upload one to your workspace
Create a new Notebook with only one Python cell that contains this command:
Attach the notebook to the created cluster and run the cell
Expected behavior
I expected the report to be generated at the provided location, just like it is when using a cluster in 'single user' access mode.
Screenshots
________ __
/ ____/ /__ ____ ___ ___ ____ / /_____ ________ __
/ __/ / / _ \/ __ `__ \/ _ \/ __ \/ __/ __ `/ ___/ / / /
/ /___/ / __/ / / / / / __/ / / / /_/ /_/ / / / /_/ /
/_____/_/\___/_/ /_/ /_/\___/_/ /_/\__/\__,_/_/ \__, /
/____/
Any feedback and suggestions are welcomed! join our community here - https://bit.ly/slack-elementary
2024-07-03 15:09:33 — INFO — Running with edr=0.15.1
2024-07-03 15:09:34 — INFO — Installing packages for edr internal dbt package...
2024-07-03 15:09:34 — INFO — Running dbt --log-format json deps --project-dir /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project --profiles-dir /Workspace/Repos/<username>/<repo_name>/<path_to_project_folder>
2024-07-03 15:09:40 — INFO — Running with dbt=1.8.3
2024-07-03 15:09:40 — INFO — Encountered an error:
[Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project/package-lock.yml'
2024-07-03 15:09:40 — INFO — Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 138, in wrapper
result, success = func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 101, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 201, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 247, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/main.py", line 447, in deps
results = task.run()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/task/deps.py", line 217, in run
self.lock()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/task/deps.py", line 204, in lock
with open(lock_filepath, "w") as lock_obj:
PermissionError: [Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project/package-lock.yml'
Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/clients/dbt/dbt_runner.py", line 88, in _run_command
result = subprocess.run(
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['dbt', '--log-format', 'json', 'deps', '--project-dir', '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project', '--profiles-dir', '/Workspace/Repos/<username>/<repo_name>/<path_to_project_folder>']' returned non-zero exit status 2.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/edr", line 8, in <module>
sys.exit(cli())
File "/databricks/python/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "/databricks/python/lib/python3.10/site-packages/click/core.py", line 1053, in main
rv = self.invoke(ctx)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/cli/cli.py", line 67, in invoke
return super().invoke(ctx)
File "/databricks/python/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/databricks/python/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/databricks/python/lib/python3.10/site-packages/click/core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "/databricks/python/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/cli.py", line 442, in report
data_monitoring = DataMonitoringReport(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/data_monitoring/report/data_monitoring_report.py", line 42, in __init__
super().__init__(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/data_monitoring/data_monitoring.py", line 35, in __init__
self.internal_dbt_runner = self._init_internal_dbt_runner()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/data_monitoring/data_monitoring.py", line 61, in _init_internal_dbt_runner
internal_dbt_runner = DbtRunner(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/clients/dbt/dbt_runner.py", line 48, in __init__
self._run_deps_if_needed()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/clients/dbt/dbt_runner.py", line 318, in _run_deps_if_needed
self.deps()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/clients/dbt/dbt_runner.py", line 116, in deps
success, _ = self._run_command(
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/clients/dbt/dbt_runner.py", line 99, in _run_command
raise DbtCommandError(err, command_args, logs=logs)
elementary.exceptions.exceptions.DbtCommandError: Failed to run dbt command.
Encountered an error:
[Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project/package-lock.yml'
Traceback (most recent call last):
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 138, in wrapper
result, success = func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 101, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 201, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/requires.py", line 247, in wrapper
return func(*args, **kwargs)
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/cli/main.py", line 447, in deps
results = task.run()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/task/deps.py", line 217, in run
self.lock()
File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dbt/task/deps.py", line 204, in lock
with open(lock_filepath, "w") as lock_obj:
PermissionError: [Errno 13] Permission denied: '/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/elementary/monitor/dbt_project/package-lock.yml'
Environment (please complete the following information):
Elementary CLI (edr) version: [e.g. 0.5.3], can be found by running pip show elementary-data
0.15.1
Elementary dbt package version: [e.g. 0.4.1], can be found in packages.yml file
0.15.2
dbt version you're using [e.g. 1.8.1]
1.8.3
Data warehouse [e.g. snowflake]
Databricks
Infrastructure details (e.g. operating system, prod / dev / staging, deployment infra, CI system, etc)
Running on Shared all purpose compute on Azure Databricks
Additional context
I did a bit of debugging and testing. When running the command, it seems to check whether the dbt packages of the project and of the internal dbt project are installed. The first check succeeds because the project is in a writable location. The second fails because it tries to create a file called package-lock.yml in the internal dbt project inside the elementary package folder. This folder is not writable on a shared cluster (I am actually surprised that it IS writable on a single user cluster).
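To confirm which directories are affected, a small probe like the one below can be run in a notebook cell on both cluster types. It is only a diagnostic sketch: `can_create_file` is a hypothetical helper, not part of elementary or dbt, and it simply attempts the same kind of file creation that fails in the traceback.

```python
import tempfile


def can_create_file(directory: str) -> bool:
    """Return True if the current user can create a file in `directory`.

    Hypothetical helper for diagnosis only. It tries to create (and then
    automatically delete) a temporary file, which is the same kind of
    operation `dbt deps` attempts when it opens package-lock.yml for writing.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory, prefix="write-probe-"):
            pass
        return True
    except OSError:
        # Covers PermissionError (Errno 13) as well as a missing directory.
        return False
```

Passing the `elementary/monitor/dbt_project` directory from the traceback should return False on the shared cluster if the analysis above is right, while the project folder under /Workspace/Repos should return True.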
I also tried installing elementary as part of the notebook instead of on cluster startup, like so: %pip install elementary-data==0.15.1. After you restart the Python kernel and run the same command, it DOES succeed. This is because in this case the elementary package is installed in a location that is writable for the logged-in user. Unfortunately this is not an option for us, as we run our project as a wheel and both elementary and dbt-databricks are installed as part of that wheel.
Maybe it is an idea to have the dbt_packages pre-installed when installing elementary? That way dbt deps won't need to write anything and it would also speed up the process a bit. This might fail when it tries to create a target folder though.
Alternatively, perhaps we can configure the location of all writeable locations (target and dbt_packages) as part of the edr command? Just like we can configure the location of the report output.
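To illustrate the idea of a configurable writable location, here is a rough sketch of what staging the internal dbt project in a temporary directory could look like. This is not how edr works today. `stage_writable_copy` and `deps_command` are hypothetical names, and edr would still need an option to point it at the copy. The dbt arguments are the ones from the failing command in the log, with only --project-dir changed.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def stage_writable_copy(project_dir: str) -> Path:
    """Copy a dbt project into a fresh temp directory and return the copy's path.

    Hypothetical helper: the copy is owned by the current user, so files such
    as package-lock.yml can be created in it.
    """
    staging_root = Path(tempfile.mkdtemp(prefix="edr-dbt-project-"))
    target = staging_root / Path(project_dir).name
    shutil.copytree(project_dir, target)
    # copytree also copies permission bits, so make sure the project root
    # is writable for the owner even if the source directory was read-only.
    target.chmod(target.stat().st_mode | stat.S_IWUSR)
    return target


def deps_command(project_dir: Path, profiles_dir: str) -> list[str]:
    """Build the same `dbt deps` command as in the log, for another project dir."""
    return [
        "dbt", "--log-format", "json", "deps",
        "--project-dir", str(project_dir),
        "--profiles-dir", profiles_dir,
    ]
```

The command list could then be passed to subprocess.run. The copy costs a little time on each run, so combining it with pre-installed dbt_packages would probably work better.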
Would you be willing to contribute a fix for this issue?
Sure.
@thijs-nijhuis @noel I see the same behavior with send-report on dbt 1.8.7 (Databricks too) when I execute the command within a container (no elevation). Did you find a workaround?