- 
                Notifications
    You must be signed in to change notification settings 
- Fork 11
Transfer to v3 #1024
base: main
Are you sure you want to change the base?
Transfer to v3 #1024
Conversation
| This PR will trigger a minor release when merged. | 
| Codecov ReportAll modified and coverable lines are covered by tests ✅ 
 Additional details and impacted files@@            Coverage Diff            @@
##              main     #1024   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          710       710           
=========================================
  Hits           710       710           ☔ View full report in Codecov by Sentry. | 
| I believe that not every PR merged into  | 
| Please run SQLFluff. | 
| Given the move to  | 
| 
 Sounds good, separate PR makes sense, no reason to combine here. My overarching goal is to ensure queries used by data.aem.live and queries used to feed spacecat and eventually DaaS -- especially those related to page view estimates -- are based on common logic so that we maintain consistency. I agree with you this is a separate topic. | 
0effbdc    to
    c39dc76      
    Compare
  
    | source, | ||
| COUNT(id) AS ids, | ||
| COUNT(DISTINCT url) AS pages, | ||
| APPROX_TOP_COUNT(url, 1)[OFFSET(0)].value AS topurl, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see a slack thread started by @shsteimer in #aem-data-desk on Jan 3.  I believe that grouping by source and taking the top url is hiding data.  Might make more sense to group by source+url combination, or group by url and take the top source (referer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't add _V4 in any of the built-in queries, make sure sqlfluff runs cleanly.
| ELSE TIMESTAMP_TRUNC(time, DAY) | ||
| END AS date | ||
| FROM helix_rum.PAGEVIEWS_V3( | ||
| FROM helix_rum.PAGEVIEWS_V4( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not create PAGEVIEWS_V4 and I object strongly to using it in any of the default queries as it is hiding data.
| MAX(weight) AS pageviews | ||
| FROM | ||
| `helix-225321.helix_rum.EVENTS_V4`( | ||
| net.host(@url), @offset, @interval, '-', '-', 'UTC', 'all', @domainkey | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- @urlshould have the same semantics as the rest of the- @urlparameters, so URL prefix without- https://, optional path and- $or- ?delimiters.
- if you list startdateandenddatein the supported parameters, you should also pass them through here
- same for timezone
- same for device
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the same data should come from PAGEVIEWS_V4 – why does this query exist when rum-pageviews is there already?
A minimal set of data endpoints necessary to run https://data.aem.live/rum-dashboard
Also I updated rum-pageviews to use Events_V4; because the version that uses Events_V3 cannot resolve www.petplace.com to petplace.com
When we use Events_V4 which prepends www and checks for results for a url, www.petplace.com and petplace.com can appear.
@trieloff
@langswei
@dbrrt