DO-2075 Added fenix and desktop baseline city seen tables #7974
@BenWu @soGaussian could you do a final review/approval on this PR? I’d like to merge early next week. @BenWu is there a time of day you recommend for the merge so initialization doesn’t affect artifact deployments? Thanks!
I'll take a closer look on Monday. Running it towards the end of the day should be fine, so it doesn't block any fixes that need to go out. I didn't see your comment on how long it would take, but it should be fine to do the initialization.

I don't think running in parallel would make it much faster, since the query is large enough to use up most of the slots. I just did a test, and a 1% query uses all 2k slots in the backfill reservation for basically the whole duration of the query.

One thing we should do is make the initialization run in the backfill reservation instead of analysis-and-etl, so there isn't slot contention with ETL. I can make that change on Monday.

Also, Kiran is planning a large backfill of events_first_seen next week, so make sure to coordinate with her so the backfills don't end up fighting for slots.
Thank you! I will check with Kiran. Running a 2% sample took ~319s for firefox_desktop and ~111s for fenix.
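As a rough back-of-the-envelope check, the 2% sample timings above can be extrapolated to a full run. This is only a sketch: it assumes runtime scales linearly with the sampled fraction, which may not hold given that even a 1% query already saturates the backfill reservation. The function name is hypothetical and not part of the PR.

```python
def estimate_full_runtime_seconds(sample_seconds: float, sample_percent: float) -> float:
    """Extrapolate a sampled query's runtime to 100% of the data.

    Assumes runtime is proportional to the sampled fraction (an assumption,
    not something measured in this thread).
    """
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in (0, 100]")
    return sample_seconds * 100 / sample_percent


# Timings reported in the thread for a 2% sample.
SAMPLE_PERCENT = 2
sample_timings = {"firefox_desktop": 319, "fenix": 111}

for app, seconds in sample_timings.items():
    full = estimate_full_runtime_seconds(seconds, SAMPLE_PERCENT)
    print(f"{app}: ~{full / 3600:.1f} h")
```

Under that linear assumption the estimates come out to roughly 4.4 hours for firefox_desktop and 1.5 hours for fenix, which is in line with artifact deployment being blocked for a few hours.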
    if name
    in ConfigLoader.get(
        "generate", "baseline_clients_city_seen", "apps", fallback=[]
    )
Assuming the allowlist is just for this initial phase, when you add the rest of the apps you'll also need to check whether each app has baseline pings, since glean.js and server glean don't use them. That doesn't need to be done now but might be needed later.
One way to do that would be (I didn't test this):

    "glean-core" in requests.get(
        f"https://probeinfo.telemetry.mozilla.org/glean/{info[0]['v1_name']}/dependencies"
    ).json()

glean-usage downloads the schemas, but that's slower if you only need to check baseline.
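A slightly fuller version of that check might look like the sketch below. It is untested against the live service, like the suggestion itself. The helper names `fetch_json` and `app_has_baseline_pings` are hypothetical, and it assumes the `/dependencies` response is a JSON container in which `in` matches dependency names such as `glean-core`. It uses the standard library's `urllib` instead of `requests` only to stay self-contained, and the fetcher is injectable so the logic can be exercised without network access.

```python
import json
from typing import Any, Callable
from urllib.request import urlopen

# URL pattern taken from the review comment; v1_name identifies the app.
PROBEINFO_URL = "https://probeinfo.telemetry.mozilla.org/glean/{v1_name}/dependencies"


def fetch_json(url: str) -> Any:
    """Download a URL and parse the body as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def app_has_baseline_pings(
    v1_name: str, fetch: Callable[[str], Any] = fetch_json
) -> bool:
    """Return True if the app lists glean-core among its dependencies.

    Assumption: apps that depend on glean-core send baseline pings, while
    glean.js and server-side apps do not list it.
    """
    dependencies = fetch(PROBEINFO_URL.format(v1_name=v1_name))
    return "glean-core" in dependencies
```

In a generator this could be combined with the allowlist filter, for example by skipping any app for which `app_has_baseline_pings(info[0]["v1_name"])` is false.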
…a.yaml Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
The query template looks good to me
@BenWu if I can get an approval from your end, perhaps I can merge today? Or would you prefer to merge this one? Thanks for the thorough review!
You'll need to delete the existing baseline_clients_city_seen_v1 tables to make the initialization run. You can merge this towards the end of the day and post a note in the data platform Slack channel that artifact deployment will be blocked for a few hours.
Description
Initialize the *baseline_city_seen tables by deriving each client’s first-seen and last-seen city, subdivision and country fields from the stable tables.
Note: This one-time initialization logic will no longer apply once city/subdivision/country fields are nulled in the stable tables.
Ongoing updates: After initialization, the tables will be updated daily via ETL using live tables (appending new clients and advancing last-seen values).
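The daily update semantics described above (append new clients, advance last-seen values, leave first-seen values alone) can be illustrated with a small in-memory sketch. This is not the PR's query, which is SQL. The `Geo` and `CitySeen` types and the `apply_daily_update` function are hypothetical, and the sketch assumes the day's pings arrive in chronological order.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Geo:
    city: Optional[str]
    subdivision: Optional[str]
    country: Optional[str]


@dataclass(frozen=True)
class CitySeen:
    first_seen: Geo
    last_seen: Geo


def apply_daily_update(
    table: Dict[str, CitySeen], todays_pings: Iterable[Tuple[str, Geo]]
) -> Dict[str, CitySeen]:
    """Return a new table with one day's (client_id, geo) pings applied."""
    updated = dict(table)
    for client_id, geo in todays_pings:
        existing = updated.get(client_id)
        if existing is None:
            # New client: first-seen and last-seen start out identical.
            updated[client_id] = CitySeen(first_seen=geo, last_seen=geo)
        else:
            # Known client: keep first-seen, advance last-seen.
            updated[client_id] = replace(existing, last_seen=geo)
    return updated
```

The one-time initialization corresponds to running the same logic over the full history in the stable tables, starting from an empty table.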