Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: check for scheduled update jobs #106

Merged
merged 16 commits into from
Sep 12, 2023

Conversation

FoSix
Copy link
Contributor

@FoSix FoSix commented Aug 4, 2023

Description

Check if there are any upcoming dynamic update jobs within a time window.
BREAKING CHANGE: management and data plane FirewallProxy method have their return values format changed from a dict to a datetime object.

Motivation and Context

An upgrade might fail if an update job kicks in. This check will verify if there are any jobs pending to run in a given time frame (default: 60 minutes).

The time window is calculated based on device's local time (management plane clock).

How Has This Been Tested?

Tested on a device connected to Panorama.

Screenshots (if appropriate)

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes if appropriate.
  • All new and existing tests passed.

@FoSix FoSix linked an issue Aug 4, 2023 that may be closed by this pull request
@FoSix FoSix changed the title feat: check for scheduled update jobs feat!: check for scheduled update jobs Aug 4, 2023
examples/readiness_checks/run_readiness_checks.py Outdated Show resolved Hide resolved
if schedule_type.split("-")[1] == "min":
time_distance = 1
details = "every minute"
elif schedule_type.split("-")[1] == "hour":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also this "at" element for "every-hour", "every-15-mins" and "every-30-mins" schedules, do we ignore that on purpose?

wildfire {
  recurring {
    every-30-mins {
      at 2;
    }
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below

result.reason = "No scheduled job present on the device."
return result

if test_window < 60:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the min test window lower like 10 or even 5 mins? For example we can use this to check if there are any scheduled jobs just before starting a software upgrade and checking the immediate time window would suffice since it's ok to start the software download/upgrade if a scheduled job exists 15 min later let's say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did 60 minutes the minimum value on purpose, for several reasons:

  • you cannot predict how long will an update take
  • if it fails and you retry, or downgrade you might hit an update job if the time window is to small
  • if you do multiple upgrades at the same time and you do not use free strategy you might hit an update if the window is to small
  • if you do an HA pair, you would do one check for the whole pair, if you split the checks per each node you might fail the 2nd node and have the pair in an inconsistent state
  • if you do multiple upgrades in one run (9.1 -> 10.0 -> 10.1, etc) you would typically assume that you check once, upgrade multiple times, if you use a short window and check before each upgrade you might brake the run somewhere in between
  • FWs are usually crucial, if you do maintenance windows for an upgrade it lasts several hours, it does not make sense to check next 10 minutes, you would like to have the whole maintenance window clear
  • etc, etc

Hence I've figured it does not make sens to write logic to calculate the 5, 15, 30 minutes intervals as normally you wouldn't and shouldn't use them

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your points, though I'm thinking the library should be flexible as possible. In an ideal environment, I agree that the maintenance window should be clear but I think that won't be easy to achieve. Let's keep it though, we can get back to this later if needed once we have more usage areas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should enforce something like best practices, we already do it in couple of places (like minor+1 upgrade strategy). I know you can think about it as a way of forcing a user to some values that he might not be OK with. But I think these are sane defaults. @adambaumeister, @horiagunica - what do you think?

panos_upgrade_assurance/check_firewall.py Outdated Show resolved Hide resolved
panos_upgrade_assurance/check_firewall.py Show resolved Hide resolved
panos_upgrade_assurance/check_firewall.py Outdated Show resolved Hide resolved
panos_upgrade_assurance/check_firewall.py Show resolved Hide resolved
panos_upgrade_assurance/check_firewall.py Outdated Show resolved Hide resolved
panos_upgrade_assurance/check_firewall.py Outdated Show resolved Hide resolved
details = f"at {next_occurrence.time()}"

elif schedule_type == "hourly":
time_distance = 60
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we calculate the time difference when it's hourly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check this comment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see but I think we can still calculate it since this is a time difference function, and check_scheduled_updates would still work the same. This is also the current case with the "daily" schedule when there might be less than 60 minutes for the next occurrence we return the exact difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we would calculate the exact difference, the only thing that would change would be the way we present the next occurrence in the comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand but it looks inconsistent with the "daily" schedule where we provide the exact minutes of difference if it's less than 60 minutes, isn't it?

panos_upgrade_assurance/firewall_proxy.py Outdated Show resolved Hide resolved
panos_upgrade_assurance/firewall_proxy.py Outdated Show resolved Hide resolved
panos_upgrade_assurance/firewall_proxy.py Outdated Show resolved Hide resolved
@github-actions
Copy link

github-actions bot commented Aug 10, 2023

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
909 895 98% 95% 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
panos_upgrade_assurance/check_firewall.py 97% 🟢
panos_upgrade_assurance/exceptions.py 100% 🟢
panos_upgrade_assurance/firewall_proxy.py 100% 🟢
panos_upgrade_assurance/utils.py 98% 🟢
TOTAL 99% 🟢

updated for commit: a29352c by action🐍

@github-actions
Copy link

A Preview PR in PanDev repo has been created. You can view it here.

FoSix and others added 2 commits August 10, 2023 11:33
Co-authored-by: Alp Eren Kose <alperenkose@gmail.com>
@FoSix FoSix requested a review from alperenkose August 10, 2023 09:47
Copy link
Collaborator

@alperenkose alperenkose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional comments :)

@FoSix FoSix force-pushed the 88-feat-check-for-scheduled-update-jobs branch from 3fbf3c7 to 49aac4c Compare August 22, 2023 14:35
@FoSix FoSix requested a review from alperenkose August 22, 2023 14:50
Comment on lines 121 to 122
print(f"\n dynamic schedules: {firewall.get_update_schedules()}")
pprint(firewall.get_update_schedules())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have the firewall.get_update_schedules() call twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's by mistake

details = f"at {next_occurrence.time()}"

elif schedule_type == "hourly":
time_distance = 60
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand but it looks inconsistent with the "daily" schedule where we provide the exact minutes of difference if it's less than 60 minutes, isn't it?

dict: All dynamic updates schedules, key is the entity type to update, like: threats, wildfire, etc.

```python showLineNumbers title="Sample output, showing values coming from Panorama"
{'@ptpl': 'lab',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would be easier to parse with XML response, do you have these '@' and '#text' fields with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a sample output in XML:

<response status="success" code="19">
  <result total-count="1" count="1">
    <update-schedule ptpl="interfaces" src="tpl">
      <threats ptpl="interfaces" src="tpl">
        <recurring ptpl="interfaces" src="tpl">
          <sync-to-peer ptpl="interfaces" src="tpl">yes</sync-to-peer>
          <daily ptpl="interfaces" src="tpl">
            <at ptpl="interfaces" src="tpl">15:30</at>
            <action ptpl="interfaces" src="tpl">download-and-install</action>
          </daily>
        </recurring>
      </threats>
      <anti-virus ptpl="interfaces" src="tpl">
        <recurring ptpl="interfaces" src="tpl">
          <daily ptpl="interfaces" src="tpl">
            <at ptpl="interfaces" src="tpl">07:45</at>
            <action ptpl="interfaces" src="tpl">download-and-install</action>
          </daily>
        </recurring>
      </anti-virus>
      <global-protect-clientless-vpn ptpl="interfaces" src="tpl">
        <recurring ptpl="interfaces" src="tpl">
          <hourly ptpl="interfaces" src="tpl">
            <at ptpl="interfaces" src="tpl">0</at>
            <action ptpl="interfaces" src="tpl">download-and-install</action>
          </hourly>
        </recurring>
      </global-protect-clientless-vpn>
      <global-protect-datafile ptpl="interfaces" src="tpl">
        <recurring ptpl="interfaces" src="tpl">
          <weekly ptpl="interfaces" src="tpl">
            <at ptpl="interfaces" src="tpl">00:45</at>
            <day-of-week ptpl="interfaces" src="tpl">sunday</day-of-week>
            <action ptpl="interfaces" src="tpl">download-and-install</action>
          </weekly>
        </recurring>
      </global-protect-datafile>
      <wf-private ptpl="interfaces" src="tpl">
        <recurring ptpl="interfaces" src="tpl">
          <every-5-mins ptpl="interfaces" src="tpl">
            <at ptpl="interfaces" src="tpl">1</at>
            <action ptpl="interfaces" src="tpl">download-and-install</action>
          </every-5-mins>
          <sync-to-peer ptpl="interfaces" src="tpl">yes</sync-to-peer>
        </recurring>
      </wf-private>
      <wildfire ptpl="interfaces" src="tpl">
        <recurring src="tpl">
          <every-min src="tpl">
            <action src="tpl">download-and-install</action>
          </every-min>
        </recurring>
      </wildfire>
    </update-schedule>
  </result>
</response>

I still do not feel cormortable in doind XML. I do prefer dicts. But if you feel it would be more elegant, you can try to refactor that code.

@FoSix FoSix merged commit 4b303f9 into main Sep 12, 2023
7 checks passed
@FoSix FoSix deleted the 88-feat-check-for-scheduled-update-jobs branch September 12, 2023 13:16
github-actions bot pushed a commit that referenced this pull request Sep 13, 2023
## [0.2.0](v0.1.4...v0.2.0) (2023-09-13)

### ⚠ BREAKING CHANGES

* check for scheduled update jobs (#106)

### Features

* check for scheduled update jobs ([#106](#106)) ([4b303f9](4b303f9))
@FoSix
Copy link
Contributor Author

FoSix commented Sep 13, 2023

🎉 This PR is included in version 0.2.0 🎉

The release is available on PyPI and GitHub release

Posted by semantic-release bot

@FoSix FoSix added the released label Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: check for scheduled update jobs
2 participants