Surface failures when installing App Store apps #25514

eugkuo · 2025-01-16T17:17:33Z

Goal

User story
As an IT admin,
I want to see failures for App Store app installs
so that I can be aware of the issues and work to resolve them.

Key result

Auto-update (patch) any software without writing custom policies

Original requests

Automatically install and scope software within a team #21825

Context

Product designer: @noahtalerman

Changes

Product

Engineering

Test plan is finalized
Feature guide changes: TODO
Database schema migrations: TODO
Load testing: TODO

ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

Requires load testing: TODO
Risk level: Low / High TODO
Risk description: TODO

Test plan

MDM

Turn off MDM on a host
Add an app store app with automatic install turned on
Check activity feed on host to see that the policy failed
Click 'view details' to see that the correct modal appears
Click the "learn more" redirects to make sure the appropriate pages show up

Lack of licenses

In ABM purchase a license for a host
Add the app and select auto-install for hosts
Check host activity feed for failure
Click the "learn more" redirects to make sure the appropriate pages show up

MDM off and lack of licenses

Turn off MDM and add an app with a single license from ABM with automatic install on.
Check activity feed on host to see that the policy failed because MDM is off.
Click 'view details' to see that the correct modal appears
Turn on MDM for host and then go to the host detail page to install the app
On the host detail page check the activity feed to make sure that the install failed due to lack of licenses.

Self-service MDM off

Ensure MDM is off for my host
Navigate to My device/Self-service
See that "install" is disabled
Activity feed TO DO (Is this relevant and testable?)

Self-service lack of licenses

Ensure a self-service app is out of licenses
Navigate to My device/Self-service
Click "install" on said application
Ensure error message appears
Activity feed TO DO

Manual Install or API call MDM off
TO DO

Manual Install or API call lack of licenses
TO DO

Host counts (when are we showing Installed, Pending, Failed)

Try to install an App Store app when there are no licenses. Check to make sure the "Failed" count on Software title page is incremented. At no point is the "Pending" count incremented.

Host details activity feed

Try to install an App Store app when there are no licenses. Check to make sure the "failed to install" activity shows up under "Past." At no point is there an install activity under "Upcoming"

Testing notes

Confirmation

Engineer: Added comment to user story confirming successful completion of test plan.
QA: Added comment to user story confirming successful completion of test plan.

eugkuo · 2025-01-16T17:21:57Z

From @iansltx

Potential fixes:

Add a host activity when VPP automations fail due to lack of MDM enrollment (and potentially due to lack of VPP app license)
1. Requires design decisions + FE
  1. Activity format
  2. Which events do we capture of:
    1. MDM not enrolled
    2. VPP licenses exhausted
  3. Level of effort: low-medium
  4. Level of tech debt: low
  5. => Noah: Let’s follow-up in 4.64.
Mark the policy as "---" if it fails but we can’t run the automation (similar to what we do for out-of-scope package install automations)
1. Can probably be eng-spec’d with design sign-off
2. Level of effort: low-medium
3. Level of tech debt: medium (continues a confusing pattern that affects how a primitive behaves)
Clear failed host policy statuses when an action is taken that would remove a barrier to automation running successfully
1. Per-host for any VPP-automated policy: host enrolls in MDM
2. Per-automated-app when we see available licenses go from zero to nonzero
  1. Would require a cron to ping ABM
  2. We’ll be building a VPP-adjacent cron anyway for #24222
3. Precedent already exists for this behavior
  1. Existing behaviors clear all policy statuses, not just failed
  2. Adding or revising a software install or script automation (removing doesn’t affect)
  3. Bringing hosts into label scope for an installer
4. Can probably be eng-spec’d with design sign-off
5. Level of effort: medium
  1. Can split MDM enrollment and license exhaustion scenarios
  2. Level of tech debt: low
    Can do (i) && ( (ii) xor (iii) )
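
A minimal sketch of option 3's license scenario, under stated assumptions: the cron keeps the previous and current available-license counts per app, and only failed statuses are cleared (unlike the existing behaviors noted above, which clear all statuses). `HostPolicyStatus`, `apps_with_restored_licenses`, and `clear_failed_statuses` are hypothetical names for illustration, not existing Fleet code.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class HostPolicyStatus:
    """Hypothetical per-host policy result for a VPP-automated policy."""
    host_id: int
    policy_id: int
    app_id: str               # assumed identifier of the automated App Store app
    passing: Optional[bool]   # True = pass, False = fail, None = cleared


def apps_with_restored_licenses(previous: Dict[str, int],
                                current: Dict[str, int]) -> Set[str]:
    """Return apps whose available license count went from zero to nonzero.

    Apps missing from `previous` are skipped: without a prior count we
    cannot tell whether a barrier was actually removed.
    """
    return {
        app_id
        for app_id, count in current.items()
        if count > 0 and previous.get(app_id) == 0
    }


def clear_failed_statuses(statuses: Iterable[HostPolicyStatus],
                          app_ids: Set[str]) -> List[HostPolicyStatus]:
    """Clear only failed statuses for the given apps; leave the rest untouched."""
    return [
        replace(s, passing=None)
        if s.passing is False and s.app_id in app_ids
        else s
        for s in statuses
    ]
```

Clearing a status lets the policy be evaluated again on the next run, which is what would re-trigger the automation. The MDM enrollment scenario could reuse `clear_failed_statuses`, filtered by host instead of by app.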

iansltx · 2025-01-18T00:49:39Z

FWIW if we decide (likely) the scope of this ticket is solution (1) above (activities for failures), we should split the additional recovery items (again, (3) is my preference here as the admin experience is much better) into (an)other issue(s) so they don't get lost.

On the topic of activities, we have two categories of failures when installing apps: validation steps prior to queueing the install request and the install process itself. The former returns a 4xx/5xx error when calling the API on a one-off install, while the latter shows up in the activity feed. The catch with VPP installs is there are a lot more reasons for a validation error than there are for FMA/custom packages, and since policy installs are automated actions we don't have the ability to "just tell the API client their action has failed."

This means that adding activities for validation failures only really needs to happen for failures triggered as part of a policy automation; we don't generate an activity when someone gets a 4xx back when calling the host software install endpoint, and we should keep on not doing that; we've already notified the client of the error. Additionally, the only "validation failure" for custom package/FMA installs that's relevant is when an install is already pending, and silently dropping a duplicate install request is already the correct behavior, so we can keep the scope of this to VPP installs in the context of policy automations.

Other note from today's call: we'll want a different activity type under the hood for this than for standard install failures, as these failures are early enough that we don't have a command ID to associate the install with yet. We can use the same "failed to install" copy in the UI, but instead of an install ID (which gets used to fetch install details in the UI on an endpoint unrelated to activities) we can include the reason the install queue attempt failed, and can surface which policy triggered the install not only in the activity API response (which we do for software installs and script runs already) but also the UI, along with remediation instructions ("enroll the host in MDM" or "~~construct more pylons~~ buy more VPP licenses.")
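
A rough sketch of what such an activity payload could look like, assuming a separate activity type that carries a failure reason and the triggering policy instead of an install or command ID. All field names, the reason values, and `build_queue_failure_activity` are hypothetical, not Fleet's actual API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class QueueFailureReason(Enum):
    """Assumed reasons an install can fail before it is queued."""
    MDM_NOT_ENROLLED = "mdm_not_enrolled"
    LICENSES_EXHAUSTED = "vpp_licenses_exhausted"


# Remediation copy shown with the activity (wording is illustrative).
REMEDIATION: Dict[QueueFailureReason, str] = {
    QueueFailureReason.MDM_NOT_ENROLLED: "Enroll the host in MDM.",
    QueueFailureReason.LICENSES_EXHAUSTED: "Buy more VPP licenses.",
}


@dataclass(frozen=True)
class Policy:
    id: int
    name: str


def build_queue_failure_activity(host_id: int,
                                 software_title: str,
                                 reason: QueueFailureReason,
                                 policy: Optional[Policy]) -> Optional[dict]:
    """Build activity details for a failed attempt to queue a VPP install.

    Returns None for manually initiated installs: the API client has
    already received the 4xx error, so no activity is recorded.
    """
    if policy is None:
        return None
    return {
        "host_id": host_id,
        "software_title": software_title,
        "policy_id": policy.id,
        "policy_name": policy.name,
        # No command ID exists yet, so the reason takes its place.
        "reason": reason.value,
        "remediation": REMEDIATION[reason],
    }
```

Because the payload has no install ID, the UI could render the details modal from the activity itself rather than fetching install details from a separate endpoint.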

eugkuo · 2025-01-20T07:42:28Z

I've added the following to this ticket:

Activity feed
Install details modals
- MDM
- Licenses

The copy in the modals is assuming we're going to do 3. I do have some questions on the copy, however. If we're going to do that, how often would the cron job run? Like what would an admin's expectation be? Once they've made an update to the host or purchased new licenses when would the install then happen?

iansltx · 2025-01-20T15:36:30Z

For MDM enrollment, with some effort we can hook the enrollment process to clear failed VPP policy automations, so that part wouldn't be on a cron.

For VPP license counts, if we're going to manage those we're going to need some more UI/API changes, as we should show available license count and last-updated-at on that license count on the title page. Could make an argument for needing a resync button here but my guess is that the cron running in the background with the ability to manually trigger via the API is enough for now. Thinking we handle this hourly, which is the same frequency as policy updates normally. We would potentially make this cron interval configurable; I'll have to check what we do for other crons.
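
As a sketch of the data this would need, assuming the cron stores one snapshot per app with the available count and its last-updated-at time: the license fetcher and clock are passed in, and `LicenseSnapshot`, `sync_license_count`, and `needs_sync` are hypothetical names rather than existing Fleet code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

# Hourly by default, matching the usual policy update frequency;
# callers can pass a different interval if this becomes configurable.
DEFAULT_SYNC_INTERVAL = timedelta(hours=1)


@dataclass(frozen=True)
class LicenseSnapshot:
    """What the title page would show: count plus when it was last refreshed."""
    app_id: str
    available: int
    updated_at: datetime


def needs_sync(snapshot: Optional[LicenseSnapshot],
               now: datetime,
               interval: timedelta = DEFAULT_SYNC_INTERVAL) -> bool:
    """True if there is no snapshot yet or it is at least one interval old."""
    return snapshot is None or now - snapshot.updated_at >= interval


def sync_license_count(app_id: str,
                       fetch_available: Callable[[str], int],
                       now: datetime) -> LicenseSnapshot:
    """Fetch the current count (e.g. from ABM) and stamp it with `now`."""
    return LicenseSnapshot(app_id=app_id,
                           available=fetch_available(app_id),
                           updated_at=now)
```

A manual trigger via the API would call `sync_license_count` directly, skipping the `needs_sync` check.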

FWIW this probably makes sense to split into three tickets:

Initial activity log changes, with remediation linking to this area of the docs (screenshot as it's still in a PR):

MDM enrollment policy clearing (and update remediation copy for that details page)
VPP license acquisition policy clearing (and update remediation copy for that details page)

as items 2 and 3 are a little bit of a heavier lift to get right, and if item 1 was the only one that made 4.64 for some reason that would still be an improvement and we could do items 2/3 in any order, potentially in parallel.

Since this is already a story, we can promote this to an Epic and add those as subtasks so no big deal, but noting so we're clear that the full fix is (a much better admin experience but) nontrivial.

iansltx · 2025-01-20T19:38:41Z

Per today's design review:

Action item 1 from the above is the scope of this ticket; the other two items will get their own FRs and be prioritized independently (EDIT: FRs filed).
Add links from modals to places useful for resolving the problem: ABM's apps page (via a learn-more-about redirect) for license exhaustion, host page from the host name, team-specific software title page from the software title.
Mention in the modals that once an issue is remediated the user can install the VPP app manually, and that if they have to do this remediation for a bunch of hosts, once they get the hosts/apps in the right spot they can see the docs (link to the above subheading) for a larger workaround.
We should link to ABM when showing the "not enough licenses" flash error on a VPP install.
We are not doing in-modal retry actions.
We are not adding additional activities for manually-initiated VPP installs.
We are not bringing install status tweaks on the host software inventory into scope here.

eugkuo · 2025-01-21T18:02:35Z

I've updated the following modals with copy and links:

Install details modals
- MDM
- Licenses

Added updated error messages

Error messages

And this pull request for the redirect:

Update routes.js #25628

eugkuo · 2025-01-21T21:44:35Z

Updated ticket to include link to PR for the reinstalling apps redirect. Also updated dev notes to match in the Figma.

eugkuo · 2025-01-23T19:46:15Z

Pulled this back into "In progress" in order to review error messages for self-service install.

To review:
- Self-service error messages

FYI @noahtalerman

noahtalerman · 2025-01-30T20:25:38Z

REST API changes: See PR here

Activity changes: See PR here

FYI @mostlikelee I ended up opening a PR for the API and activity changes. It was easier to show everything changing in one place (one PR).

Moved this story to "Ready to spec"

noahtalerman · 2025-02-03T14:20:29Z

@mostlikelee just a reminder that this user story is ready to spec. Can you please work with @jmwatts to complete the TODOs in the "Engineering" and "Test plan" sections? Thanks!

noahtalerman · 2025-02-04T14:41:32Z

Hey @mostlikelee just a reminder that this user story is ready to spec and estimation is tomorrow! Can you please complete the TODOs in the "Engineering" and "Test plan" sections?

jmwatts · 2025-02-04T14:57:50Z

@noahtalerman Just to clarify, would you like us to define expected behavior for the "Self service activity feed", "Manual Install or API call MDM off" and "Manual Install or API call lack of licenses" scenarios?

noahtalerman · 2025-02-04T23:19:02Z

Just to clarify, would you like us to define expected behavior for the "Self service activity feed", "Manual Install or API call MDM off" and "Manual Install or API call lack of licenses" scenarios?

@jmwatts I think up to you!

eugkuo added :product Product Design department (shows up on 🦢 Drafting board) story A user story defining an entire feature labels Jan 16, 2025

eugkuo changed the title ~~Surface and address automatic poilicy failures when installing App Store apps~~ Surface and address automatic policy failures when installing App Store apps Jan 16, 2025

This was referenced Jan 16, 2025

Expose automatic install failures in UI if MDM is off or scripts are disabled. #25414

Closed

Automatically install and scope software within a team #21825

Open

noahtalerman assigned eugkuo Jan 16, 2025

jmwatts mentioned this issue Jan 16, 2025

Unclear to admins what happens when an automation fails on a policy #25452

Closed

noahtalerman added Epic DO NOT USE. Auto-created by ZenHub, cannot be disabled. and removed Epic DO NOT USE. Auto-created by ZenHub, cannot be disabled. labels Jan 17, 2025

This was referenced Jan 21, 2025

Clear failed policy statuses for VPP-automated policies when MDM is turned on for a host #25623

Open

Clear failed policy statuses for a VPP-automated policy when we detect the automated app goes from zero available licenses to more than zero #25624

Open

eugkuo added the #g-software Software product group label Jan 23, 2025

noahtalerman assigned noahtalerman and unassigned eugkuo Jan 29, 2025

noahtalerman changed the title ~~Surface and address automatic policy failures when installing App Store apps~~ Surface and address failures when installing App Store apps Jan 30, 2025

noahtalerman changed the title ~~Surface and address failures when installing App Store apps~~ Surface failures when installing App Store apps Jan 30, 2025

noahtalerman assigned mostlikelee and unassigned noahtalerman Jan 30, 2025

noahtalerman mentioned this issue Jan 30, 2025

[API design] Surface failures when installing App Store apps #25907

Open

noahtalerman mentioned this issue Feb 4, 2025

Surface failures when installing custom packages and Fleet-maintained apps #26013

Open

jmwatts mentioned this issue Feb 4, 2025

App Store apps (VPP): create policies for automatic install #23744

Open

34 tasks
