Surface failures when installing App Store apps #25514
Comments
From @iansltx Potential fixes:
|
FWIW, if we decide (likely) that the scope of this ticket is solution (1) above (activities for failures), we should split the additional recovery items into one or more other issues so they don't get lost (again, (3) is my preference here, as the admin experience is much better).

On the topic of activities, we have two categories of failures when installing apps: validation steps prior to queueing the install request, and the install process itself. The former returns a 4xx/5xx error when calling the API on a one-off install, while the latter shows up in the activity feed. The catch with VPP installs is that there are a lot more reasons for a validation error than there are for FMA/custom packages, and since policy installs are automated actions, we don't have the ability to "just tell the API client their action has failed."

This means that adding activities for validation failures only really needs to happen for failures triggered as part of a policy automation. We don't generate an activity when someone gets a 4xx back when calling the host software install endpoint, and we should keep not doing that, since we've already notified the client of the error. Additionally, the only "validation failure" for custom package/FMA installs that's relevant is when an install is already pending, and silently dropping a duplicate install request is already the correct behavior, so we can limit the scope of this to VPP installs in the context of policy automations.

Other note from today's call: we'll want a different activity type under the hood for this than for standard install failures, as these failures happen early enough that we don't have a command ID to associate the install with yet.
We can use the same "failed to install" copy in the UI, but instead of an install ID (which gets used to fetch install details in the UI on an endpoint unrelated to activities) we can include the reason the install queue attempt failed, and can surface which policy triggered the install not only in the activity API response (which we do for software installs and script runs already) but also the UI, along with remediation instructions ("enroll the host in MDM" or " |
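As a rough illustration of the scoping rule described above, here is a minimal Python sketch of when a validation failure would produce an activity and what that activity might carry. It is not Fleet's implementation: the `InstallRequest` type, the `validation_failure_activity` helper, the activity type string, and every field name are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstallRequest:
    """Hypothetical model of an install request before it is queued."""
    host_id: int
    software_title: str
    is_vpp: bool                       # App Store (VPP) app vs. custom package/FMA
    policy_id: Optional[int] = None    # set only when a policy automation triggered it
    policy_name: Optional[str] = None


def validation_failure_activity(req: InstallRequest, reason: str) -> Optional[dict]:
    """Return an activity payload for a pre-queue validation failure, or None.

    One-off installs (UI/API) get a 4xx/5xx response instead, so no activity.
    Non-VPP installs are skipped because the only relevant validation failure
    there (an install already pending) is silently dropped.
    """
    if req.policy_id is None or not req.is_vpp:
        return None
    return {
        # Hypothetical type name, distinct from the standard install-failure type.
        "type": "failed_to_queue_app_store_app_install",
        "details": {
            "host_id": req.host_id,
            "software_title": req.software_title,
            "policy_id": req.policy_id,
            "policy_name": req.policy_name,
            # No command ID exists yet, so the reason is carried instead.
            "reason": reason,
        },
    }
```

For example, a policy-triggered VPP install on a host without MDM would yield a payload whose `details` include the policy and the reason, while the same failure on a manual API call would return `None` because the caller has already received the error.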
I've added the following to this ticket:
The copy in the modals is assuming we're going to do 3. I do have some questions on the copy, however. If we're going to do that, how often would the cron job run? Like what would an admin's expectation be? Once they've made an update to the host or purchased new licenses when would the install then happen? |
For MDM enrollment, with some effort we can hook the enrollment process to clear failed VPP policy automations, so that part wouldn't be on a cron. For VPP license counts, if we're going to manage those we're going to need some more UI/API changes, as we should show available license count and last-updated-at on that license count on the title page. Could make an argument for needing a resync button here but my guess is that the cron running in the background with the ability to manually trigger via the API is enough for now. Thinking we handle this hourly, which is the same frequency as policy updates normally. We would potentially make this cron interval configurable; I'll have to check what we do for other crons. FWIW this probably makes sense to split into three tickets:
as items 2 and 3 are a little bit of a heavier lift to get right, and if item 1 was the only one that made 4.64 for some reason, that would still be an improvement and we could do items 2/3 in any order, potentially in parallel. Since this is already a story, we can promote this to an Epic and add those as subtasks, so no big deal, but noting so we're clear that the full fix is (a much better admin experience but) nontrivial. |
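The two recovery paths discussed above (a hook on MDM enrollment, and an hourly cron for license counts) could be sketched roughly as follows. This is only an illustration under assumptions: `FailedAutomation`, `FailureReason`, `retry_on_mdm_enrollment`, and `retry_on_license_sync` are hypothetical names, each failure is assumed to have a single reason, and how license counts are fetched is left out entirely.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class FailureReason(Enum):
    MDM_OFF = "mdm_off"
    NO_LICENSES = "no_licenses"


@dataclass
class FailedAutomation:
    """Hypothetical record of a VPP policy automation that failed validation."""
    host_id: int
    policy_id: int
    app_id: str
    reason: FailureReason


Split = Tuple[List[FailedAutomation], List[FailedAutomation]]


def retry_on_mdm_enrollment(host_id: int, failed: List[FailedAutomation]) -> Split:
    """Enrollment hook: return (to_retry, still_failed) for the host that just enrolled."""
    to_retry = [f for f in failed
                if f.host_id == host_id and f.reason is FailureReason.MDM_OFF]
    still_failed = [f for f in failed if f not in to_retry]
    return to_retry, still_failed


def retry_on_license_sync(failed: List[FailedAutomation],
                          available: Dict[str, int]) -> Split:
    """Cron path (hourly in the proposal): retry while licenses remain per app."""
    remaining = dict(available)  # copy so the caller's counts are not mutated
    to_retry, still_failed = [], []
    for f in failed:
        if f.reason is FailureReason.NO_LICENSES and remaining.get(f.app_id, 0) > 0:
            remaining[f.app_id] -= 1
            to_retry.append(f)
        else:
            still_failed.append(f)
    return to_retry, still_failed
```

Keeping the enrollment path event-driven and the license path on a timer mirrors the comment above: enrollment is something the server can observe directly, while license availability only changes when counts are resynced.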
Per today's design review:
|
I've updated the following modals with copy and links: Added updated error messages. And this pull request for the redirect: |
Updated ticket to include a link to the PR for the reinstalling apps redirect. Also updated dev notes to match in the Figma. |
Pulled this back into "In progress" in order to review error messages for self-service install. To review: FYI @noahtalerman |
FYI @mostlikelee I ended up opening a PR for the API and activity changes. It was easier to show everything changing in one place (one PR). Moved this story to "Ready to spec" |
@mostlikelee just a reminder that this user story is ready to spec. Can you please work with @jmwatts to complete the TODOs in the "Engineering" and "Test plan" sections? Thanks! |
Hey @mostlikelee just a reminder that this user story is ready to spec and estimation is tomorrow! Can you please complete the TODOs in the "Engineering" and "Test plan" sections? |
@noahtalerman Just to clarify, would you like us to define expected behavior for the "Self service activity feed", "Manual Install or API call MDM off" and "Manual Install or API call lack of licenses" scenarios? |
@jmwatts I think up to you! |
Goal
Key result
Auto-update (patch) any software without writing custom policies
Original requests
Context
Changes
Product
Engineering
QA
Risk assessment
Test plan
MDM
Lack of licenses
MDM off and lack of licenses
Self-service MDM off
Self-service lack of licenses
Manual Install or API call MDM off
TO DO
Manual Install or API call lack of licenses
TO DO
Host counts (when are we showing Installed, Pending, Failed)
Host details activity feed
Testing notes
Confirmation