SNOW-1906607: Fix telemetry collection for all SnowflakePlan #2967
api_calls[0][TelemetryField.THREAD_IDENTIFIER.value] = threading.get_ident()
except Exception:
pass
api_calls[0][TelemetryField.THREAD_IDENTIFIER.value] = threading.get_ident()
Would threading.get_ident() throw an exception under any circumstances? If so, let's keep the try/except.
This is a method of the built-in Python module threading, which is unlikely to fail. The only issue I could find relating to it throwing an exception is python/cpython#128189, which looks like it was more of a user error than a library error.
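As a rough illustration of the point above, the sketch below calls threading.get_ident() from the main thread and from a few worker threads and records what comes back. The helper name collect_idents is made up for this example and is not part of the PR.

```python
import threading
from typing import List


def collect_idents(n_workers: int = 3) -> List[int]:
    """Return get_ident() for the calling thread plus n_workers worker threads.

    collect_idents is a hypothetical helper for illustration only.
    """
    idents = [threading.get_ident()]
    lock = threading.Lock()

    def worker() -> None:
        ident = threading.get_ident()
        with lock:  # protect the shared list while appending
            idents.append(ident)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return idents


idents = collect_idents()
```

Each value is a plain nonzero integer. Identifiers may be reused after a thread exits, so they are not guaranteed to be unique across the lifetime of the process.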
CompilationStageTelemetryField.QUERY_PLAN_HEIGHT.value
] = plan_state[PlanState.PLAN_HEIGHT]
api_calls[0][
CompilationStageTelemetryField.QUERY_PLAN_NUM_SELECTS_WITH_COMPLEXITY_MERGED.value
@sfc-gh-aalam what kind of discrepancy are you referring to in the Slack thread https://snowflake.slack.com/archives/C03MJ5AA8CS/p1738347392255399?
In the telemetry we collect for job ETL logs, there are SQL statements submitted by the Snowpark client that failed with error codes we are interested in, but the corresponding telemetry for the same plan UUID does not exist in the client telemetry sent from this part of the code. This is because the decorator df_collect_api_telemetry is applied only to a select few APIs and not to all functions. As a result, we miss some important cases like dataframe.write.save_as_table.
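To make the gap described above concrete, here is a minimal sketch that does not use the Snowpark code base. It assumes a decorator that records telemetry only for the functions it wraps, and contrasts that with recording at a single point every plan passes through. All names here (collect_api_telemetry, execute_plan, the plan IDs) are hypothetical stand-ins.

```python
import functools
from typing import Callable, List

decorator_telemetry: List[str] = []  # what a per-API decorator would report
plan_telemetry: List[str] = []       # what a central execution hook would report


def execute_plan(plan_id: str) -> str:
    """Hypothetical single entry point that every plan execution goes through."""
    plan_telemetry.append(plan_id)  # recorded for every plan, decorated or not
    return f"result of {plan_id}"


def collect_api_telemetry(func: Callable) -> Callable:
    """Hypothetical decorator: records telemetry only for wrapped functions."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        decorator_telemetry.append(func.__name__)
        return result

    return wrapper


@collect_api_telemetry
def collect() -> str:
    return execute_plan("plan-collect")


def save_as_table() -> str:  # not decorated, so the decorator never sees it
    return execute_plan("plan-save")


collect()
save_as_table()
```

After both calls, decorator_telemetry contains only the decorated function, while plan_telemetry contains both plans. That is the shape of the discrepancy: plans executed through undecorated APIs have no matching client telemetry.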
Unrelated, but do we think any of the existing logic in the decorator may also be relevant to functions that are not currently decorated, like save_as_table?
@@ -765,6 +773,45 @@ def get_result_set(

return result, result_meta

def send_plan_metrics_telemetry(self, plan: SnowflakePlan) -> None:
"""Extract the SnowflakePlan's metrics and including plan_state, uuid identifiers, complexity
classification breakdown, and complexity score.
Shall we add a parameter here to guard the sending of the new telemetry? It now occurs on the critical path of query execution, and we do not know whether it will introduce any side effects.
There should ideally be no side effects, but having parameter protection sounds reasonable given how unstable we have been recently.
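One way the parameter guard discussed above could look is sketched below. It is not the PR's implementation: FakeConnection, the plan_metrics_enabled flag, and the dict-shaped plan are assumptions made up for illustration. The idea is that the telemetry call is skipped when the flag is off, and that any failure inside it is swallowed so it cannot break query execution.

```python
from typing import Any, Dict, List


class FakeConnection:
    """Hypothetical stand-in for a server connection; not the Snowpark class."""

    def __init__(self, plan_metrics_enabled: bool) -> None:
        self.plan_metrics_enabled = plan_metrics_enabled  # assumed guard parameter
        self.sent: List[Dict[str, Any]] = []              # stand-in for a telemetry client

    def send_plan_metrics_telemetry(self, plan: Dict[str, Any]) -> None:
        if not self.plan_metrics_enabled:
            return  # guard: telemetry is switched off
        try:
            self.sent.append({"plan_uuid": plan["uuid"], "plan_height": plan["height"]})
        except Exception:
            pass  # telemetry sits on the critical path and must never fail the query

    def execute(self, plan: Dict[str, Any]) -> str:
        self.send_plan_metrics_telemetry(plan)
        return f"rows for {plan.get('uuid', 'unknown')}"


enabled = FakeConnection(plan_metrics_enabled=True)
disabled = FakeConnection(plan_metrics_enabled=False)
plan = {"uuid": "abc", "height": 3}

enabled_result = enabled.execute(plan)
disabled_result = disabled.execute(plan)
# A malformed plan makes the telemetry step fail, but execution still returns.
malformed_result = enabled.execute({"uuid": "no-height"})
```

With the flag off, nothing is recorded and execution proceeds unchanged, which gives a quick way to turn the new telemetry off if it turns out to have side effects.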
Looks good, thanks Afroz.
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1906607
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Currently, telemetry for Snowflake plan metrics is collected only for APIs that are tagged with df_collect_api_telemetry. This PR fixes that and collects telemetry for all Snowflake plans under the type snowpark_compilation_stage_statistics.