Skip to content
This repository was archived by the owner on Sep 18, 2025. It is now read-only.

Conversation

@dbrrt
Copy link
Collaborator

@dbrrt dbrrt commented Dec 5, 2023

This introduces a new query for EDS alerting requirements. It aims to return urls, with their respective pageviews and p75 cwv for a given url pattern (e.g.: www.adobe.com/express), having a minimal count of cwv events matching the cwv_type passed as a parameter.

This can be helpful to know which URLs with a cwv count (for the cwv_type passed as parameter, e.g.: LCP) greater than the threshold passed as parameter.

Related Issues

https://cq-dev.slack.com/archives/C05A45JBP9N/p1700727862022749

Some adjustements from the initial thread have been made

  • cwv_type added as a parameter to retrieve records with cwv min count (cwv type specific)
  • cwv_count_threshold minimum count threshold to expose results
  • avg_daily_pageviews_factor parameter, aiming to reduce pageviews calculation issues related to sampling

Please ensure your pull request adheres to the following guidelines:

  • make sure to link the related issues in this description
  • when merging / squashing, make sure the fixed issue references are visible in the commits, for easy compilation of release notes

Copy link
Contributor

@trieloff trieloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really clear what this query is supposed to be doing.

Comment on lines 47 to 53
WHERE
url LIKE CONCAT("https://", @url, "%")
OR url LIKE CONCAT(
"https://%", @repo, "--", @owner, ".hlx%/"
) OR (@url = "-" AND @repo = "-" AND @owner = "-")
GROUP BY id
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not work/is superfluous. helix_rum.EVENTS_V3 already filters by URL. And @owner and @repo aren't working.

ON filtered_data.url = cru.url
WHERE
cwv_count > @cwv_count_threshold
AND (ce.events * cru.weight) > @interval * @sampling_noise_factor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does @sampling_noise_factor do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose is to define a minimal average pageviews per day to only return records over the threshold for the given interval.

pageviews > interval * 1000 is up to debate. We arbitrarily chose 1000, just to be on the safe side (sampling noise).

That being said, sampling_noise_factor is a terrible variable name, I'll rename this variable to avg_daily_pageviews_factor instead to make it clearer, and give the consumers to fine tune it accordingly.

@dbrrt
Copy link
Collaborator Author

dbrrt commented Dec 5, 2023

I'm not really clear what this query is supposed to be doing.

This is fair, this use case is quite specific, I've added more context to the PR description.

@dbrrt
Copy link
Collaborator Author

dbrrt commented Dec 5, 2023

Closing in favor of #1013

@dbrrt dbrrt closed this Dec 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants