Logging config for bakeries #92
It would be awesome if we could make this work via a config file. During testing, I also needed to see [...]. That code uses [...]

My hope is that by setting this in the Dask config, that won't be necessary. As the Dask workers load the Dask config, the logging config will automatically be applied (to all the loggers, not just the Dask ones).
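As a rough illustration of what "automatically applied" would mean here (this is a sketch of the general idea, not the exact code path inside `distributed`), the worker-side effect is essentially reading the `logging` section of the Dask config and handing it to the stdlib:

```python
# Sketch only: roughly what applying a dict-style "logging" section of the
# Dask config amounts to on each worker. distributed has its own internal
# logic for this; the point is just that a version-1 dict config placed in
# /etc/dask/*.yaml can end up in logging.config.dictConfig.
import logging.config

import dask

log_config = dask.config.get("logging", default={})
if log_config.get("version") == 1:
    # "New style" dict config: hand the whole mapping to the stdlib.
    logging.config.dictConfig(log_config)
```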
@TomAugspurger Thus far on the bakery side we are taking a two-pronged logging approach. Individual worker logging is redirected to cloud provider log capture. In the case of AWS this means there are separate logs for each worker used by a flow in the [...]. There are a few issues with this approach: all of the worker logs are consolidated and displayed as a single log stream in Prefect Cloud, which can be noisy and difficult to track with very verbose logging. In addition, there is an outstanding issue that prevents log redirection for [...].
@sharkinsspatial does the second code block at https://docs.dask.org/en/latest/debugging.html?highlight=logging#logs help? I imagine that we could have something like

```yaml
# file: /etc/dask/logging.yaml
logging:
  version: 1
  handlers:
    console:
      class: logging.StreamHandler
      level: INFO
  loggers:
    pangeo_forge:
      level: INFO
      handlers:
        - console
    distributed.worker:
      level: INFO
      handlers:
        - console
    distributed.scheduler:
      level: INFO
      handlers:
        - console
```

In addition, we would need to check [...]

LMK if you have other questions about this approach.
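One way to confirm whether a config like this actually reaches the workers (a sketch, not something from the thread; the scheduler address and helper name are assumptions) is to ask each worker what its loggers look like:

```python
# Sketch: check on every worker whether the pangeo_forge logger picked up
# the level/handlers from the Dask config. Assumes a distributed Client
# already connected to the bakery's cluster.
import logging

from distributed import Client

client = Client("tcp://scheduler-address:8786")  # hypothetical address

def describe_logger(name="pangeo_forge"):
    logger = logging.getLogger(name)
    return {
        "level": logging.getLevelName(logger.getEffectiveLevel()),
        "handlers": [type(h).__name__ for h in logger.handlers],
    }

# Runs on each worker process and returns a dict keyed by worker address.
print(client.run(describe_logger))
```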
I've never used it before, but there is the [...]
@TomAugspurger I'll continue tracking this here. Recipe module logging level is configurable at flow registration time with this wrapper https://github.com/pangeo-forge/pangeo-forge-prefect/blob/master/pangeo_forge_prefect/flow_manager.py#L58. This works correctly within each cloud provider's log solution (CloudWatch in the ECS case), but I need to coordinate further with the Prefect team to get a clearer picture of the limitations on log shipping to Prefect Cloud. Currently I am using their extra loggers functionality, but it appears that there is an issue with the timing of log packaging and flushing that prevents entire worker log streams from being written; only a segment of each stream is shipped to Prefect Cloud.
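For reference, the "extra loggers" mechanism mentioned above is typically wired up through Prefect's config/environment. A minimal sketch, assuming Prefect 0.14-style config (the flow name and run-config values here are placeholders, not taken from the bakery code):

```python
# Sketch: asking Prefect (0.14-style config) to forward a library logger's
# records to Prefect Cloud alongside its own logs.
from prefect import Flow
from prefect.run_configs import ECSRun

flow = Flow("pangeo-forge-recipe")  # hypothetical flow
flow.run_config = ECSRun(
    env={
        # Prefect's cloud log handler will also capture records emitted by
        # these loggers in processes that honor this environment.
        "PREFECT__LOGGING__EXTRA_LOGGERS": '["pangeo_forge_recipes"]',
        "PREFECT__LOGGING__LEVEL": "DEBUG",
    },
)
```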
This is a followup to #84 (comment).
The pangeo-forge library and bakeries need to coordinate on producing and collecting rich logging information for debugging. As recommended in https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library, libraries like pangeo-forge shouldn't do any configuration. That's up to the applications (bakeries in our case).
I imagine that each bakery will have some configuration: https://docs.python.org/3/library/logging.config.html (either a file or an in-memory dictionary, whatever is easiest). The tricky part here is that we need to propagate that configuration to all of the Dask workers. https://docs.dask.org/en/latest/debugging.html#logs has documentation on configuring Dask worker logs (what you get with `logging.getLogger("distributed.worker")`). We need to make sure that the worker processes are also configured to emit the `logging.getLogger("pangeo_forge")` logs. I believe this is doable by including the configuration for the pangeo-forge logger under the `logging` key in the dask config, but that needs to be confirmed (I have to run now so I can't check it, but I wanted to write this up before forgetting).

cc @sharkinsspatial and @CiaranEvans for the bakery side.
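To make the division of responsibilities concrete, here is a minimal sketch (module and config names are illustrative placeholders) of the library side staying configuration-free while the bakery owns a dict config:

```python
# Sketch of the library/application split described above.

# --- library side (pangeo-forge): emit logs, configure nothing ---
import logging

logger = logging.getLogger("pangeo_forge")
logger.addHandler(logging.NullHandler())  # standard practice for libraries

def process_chunk(key):
    logger.info("processing chunk %s", key)

# --- application side (a bakery): own the configuration ---
import logging.config

BAKERY_LOGGING = {
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler", "level": "INFO"}},
    "loggers": {
        "pangeo_forge": {"level": "INFO", "handlers": ["console"]},
        "distributed.worker": {"level": "INFO", "handlers": ["console"]},
    },
}

logging.config.dictConfig(BAKERY_LOGGING)  # and, somehow, on every Dask worker too
process_chunk("0-of-10")
```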