Commit bc9ffe2

Add enterprise-wide code scanning alerts for Enterprise Server and GHAE (#3)
* start work on ghes/ghae support
* add csv files to gitignore
* add enterprise report function
* add enterprise-scope code scanning reporting
* update readme
* add dependency review check
* mess with line length in linter
* mess with linter
* still messing with linter
1 parent d15982a commit bc9ffe2

8 files changed: +217 -42 lines changed

.github/linters/.markdownlint.json

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 {
-  "MD013": false,
+  "line-length": false,
   "MD033": { "allowed_elements": ["br"] }
 }
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+name: "Dependency Review"
+on: [pull_request]
+
+permissions:
+  contents: read
+
+jobs:
+  dependency-review:
+    runs-on: ubuntu-latest
+    steps:
+      - name: "Checkout Repository"
+        uses: actions/checkout@v3
+      - name: "Dependency Review"
+        uses: actions/dependency-review-action@v1

.github/workflows/linter.yml

Lines changed: 2 additions & 1 deletion
@@ -47,12 +47,13 @@ jobs:
       # Run Linter against code base #
       ################################
       - name: Lint Code Base
-        uses: github/super-linter@v4
+        uses: github/super-linter/slim@v4
         env:
           VALIDATE_ALL_CODEBASE: false
           DEFAULT_BRANCH: main
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           VALIDATE_DOCKERFILE_HADOLINT: true
           VALIDATE_GITHUB_ACTIONS: true
           VALIDATE_MARKDOWN: true
+          MARKDOWN_CONFIG_FILE: .markdownlint.json
           VALIDATE_PYTHON_BLACK: true

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -130,3 +130,6 @@ dmypy.json
 
 # Notes, etc.
 swap.md
+
+# CSV files
+*.csv

README.md

Lines changed: 9 additions & 9 deletions
@@ -32,7 +32,7 @@ An example of use is below. Note that the custom inputs, such as if you are wan
 
 ```yaml
       - name: CSV export
-        uses: some-natalie/ghas-to-csv@v0.2.0
+        uses: some-natalie/ghas-to-csv@v0.3.0
         env:
           GITHUB_PAT: ${{ secrets.PAT }}  # if you need to set a custom PAT
       - name: Upload CSV
@@ -43,21 +43,17 @@ An example of use is below. Note that the custom inputs, such as if you are wan
           if-no-files-found: error
 ```
 
-## But it doesn't do THIS THING
-
-The API docs are [here](https://docs.github.com/en/enterprise-cloud@latest) and pull requests are welcome! :heart:
-
 ## Reporting
 
 | | GitHub Enterprise Cloud | GitHub Enterprise Server (3.4) | GitHub AE (M2) | Notes |
 | --- | --- | --- | --- | --- |
 | Secret scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:white_check_mark: Enterprise | :white_check_mark: Repo<br>:white_check_mark: Org<br>:white_check_mark: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/secret-scanning) |
-| Code scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:x: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/code-scanning) |
+| Code scanning | :white_check_mark: Repo<br>:white_check_mark: Org<br>:x: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:curly_loop: Enterprise | :white_check_mark: Repo<br>:x: Org<br>:curly_loop: Enterprise | [API docs](https://docs.github.com/en/enterprise-cloud@latest/rest/reference/code-scanning) |
 | Dependabot | :x: | :x: | :x: | Waiting on [this API](https://github.com/github/roadmap/issues/495) to :ship: |
 
 :information_source: All of this reporting requires either public repositories or a GitHub Advanced Security license.
 
-:information_source: Any item with a :curly_loop: needs some looping logic, since repositories are supported and not higher-level ownership (like orgs or enterprises). How this looks won't differ much between GHAE or GHES. In both cases, you'll need an enterprise admin PAT to access the `all_organizations.csv` or `all_repositories.csv` report from `stafftools/reports`, then looping over it in the appropriate scope. That will tell you about the existence of everything, but not give you permission to access it. To do that, you'll need to use `ghe-org-admin-promote` in GHES ([link](https://docs.github.com/en/enterprise-server@3.4/admin/configuration/configuring-your-enterprise/command-line-utilities#ghe-org-admin-promote))
+:information_source: Any item with a :curly_loop: needs some looping logic, since repositories are supported and not higher-level ownership (like orgs or enterprises). How this looks won't differ much between GHAE or GHES. In both cases, you'll need an enterprise admin PAT to access the `all_organizations.csv` or `all_repositories.csv` report from `stafftools/reports`, then looping over it in the appropriate scope. That will tell you about the existence of everything, but not give you permission to access it. To do that, you'll need to use `ghe-org-admin-promote` in GHES ([link](https://docs.github.com/en/enterprise-server@latest/admin/configuration/configuring-your-enterprise/command-line-utilities#ghe-org-admin-promote)) to own all organizations within the server.
 
 ## Using this with Flat Data
 
@@ -79,7 +75,7 @@ jobs:
       - name: Check out repo
         uses: actions/checkout@v3
       - name: CSV export
-        uses: some-natalie/ghas-to-csv@v0.2.0
+        uses: some-natalie/ghas-to-csv@v0.3.0
         env:
           GITHUB_PAT: ${{ secrets.PAT }}  # needed if not running against the current repository
           SCOPE_NAME: "OWNER-NAME/REPO-NAME"  # repository name, needed only if not running against the current repository
@@ -121,6 +117,10 @@ jobs:
 nginx-pid/
 ```
 
-## Notes
+## But it doesn't do THIS THING
+
+The API docs are [here](https://docs.github.com/en/enterprise-cloud@latest) and pull requests are welcome! :heart:
+
+## Other notes
 
 [GitHub Copilot](https://copilot.github.com/) wrote most of the Python code in this project. I mostly just structured the files/functions, wrote some docstrings, accounted for the differences in API versions across the products, and edited what it gave me. :heart:

main.py

Lines changed: 13 additions & 1 deletion
@@ -17,7 +17,7 @@
 """
 
 # Import modules
-from src import code_scanning, secret_scanning
+from src import code_scanning, enterprise, secret_scanning
 import os
 
 # Read in config values
@@ -26,6 +26,11 @@
 else:
     api_endpoint = os.environ.get("GITHUB_API_ENDPOINT")
 
+if os.environ.get("GITHUB_SERVER_URL") is None:
+    url = "https://github.com"
+else:
+    url = os.environ.get("GITHUB_SERVER_URL")
+
 if os.environ.get("GITHUB_PAT") is None:
     github_pat = os.environ.get("GITHUB_TOKEN")
 else:
@@ -49,6 +54,13 @@
         api_endpoint, github_pat, scope_name
     )
     secret_scanning.write_enterprise_secrets_list(secrets_list)
+    # code scanning
+    if enterprise.get_enterprise_version(api_endpoint) != "GHEC":
+        repo_list = enterprise.get_repo_report(url, github_pat)
+        cs_list = code_scanning.list_enterprise_code_scanning_alerts(
+            api_endpoint, github_pat, repo_list
+        )
+        code_scanning.write_enterprise_cs_list(cs_list)
 elif report_scope == "organization":
     # code scanning
     cs_list = code_scanning.list_org_code_scanning_alerts(
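
A possible simplification of the `GITHUB_SERVER_URL` handling in this file: `os.environ.get` accepts a default as its second argument, so the four-line `if`/`else` could collapse into one call with the same result. This is only a sketch; the helper name `read_server_url` and its `env` parameter are illustrative and not part of the commit, which reads the variable inline.

```python
import os


def read_server_url(env=os.environ):
    """Return the server URL, falling back to github.com when the variable is unset."""
    # Hypothetical helper: main.py in this commit does the lookup inline instead.
    return env.get("GITHUB_SERVER_URL", "https://github.com")
```

Passing the mapping in as a parameter is just a convenience for testing; calling `read_server_url()` with no arguments reads the real environment.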

src/code_scanning.py

Lines changed: 123 additions & 30 deletions
@@ -22,26 +22,20 @@ def list_repo_code_scanning_alerts(api_endpoint, github_pat, repo_name):
     url = "{}/repos/{}/code-scanning/alerts?per_page=100&page=1".format(
         api_endpoint, repo_name
     )
-    response = requests.get(
-        url,
-        headers={
-            "Authorization": "token {}".format(github_pat),
-            "Accept": "application/vnd.github.v3+json",
-        },
-    )
+    headers = {
+        "Authorization": "token {}".format(github_pat),
+        "Accept": "application/vnd.github.v3+json",
+    }
+    response = requests.get(url, headers=headers)
+    if response.status_code == 404:
+        return "need permission to access,{}".format(repo_name)  # don't have permission
+    if response.status_code == 403:
+        return "need to enable GHAS,{}".format(repo_name)  # no GHAS
     response_json = response.json()
     while "next" in response.links.keys():
-        response = requests.get(
-            response.links["next"]["url"],
-            headers={
-                "Authorization": "token {}".format(github_pat),
-                "Accept": "application/vnd.github.v3+json",
-            },
-        )
+        response = requests.get(response.links["next"]["url"], headers=headers)
         response_json.extend(response.json())
 
-    print("Found {} code scanning alerts in {}".format(len(response_json), repo_name))
-
     # Return code scanning alerts
     return response_json
 
@@ -131,22 +125,14 @@ def list_org_code_scanning_alerts(api_endpoint, github_pat, org_name):
     url = "{}/orgs/{}/code-scanning/alerts?per_page=100&page=1".format(
         api_endpoint, org_name
     )
-    response = requests.get(
-        url,
-        headers={
-            "Authorization": "token {}".format(github_pat),
-            "Accept": "application/vnd.github.v3+json",
-        },
-    )
+    headers = {
+        "Authorization": "token {}".format(github_pat),
+        "Accept": "application/vnd.github.v3+json",
+    }
+    response = requests.get(url, headers=headers)
     response_json = response.json()
     while "next" in response.links.keys():
-        response = requests.get(
-            response.links["next"]["url"],
-            headers={
-                "Authorization": "token {}".format(github_pat),
-                "Accept": "application/vnd.github.v3+json",
-            },
-        )
+        response = requests.get(response.links["next"]["url"], headers=headers)
         response_json.extend(response.json())
 
     print("Found {} code scanning alerts in {}".format(len(response_json), org_name))
@@ -235,3 +221,110 @@ def write_org_cs_list(cs_list):
                     str(cs["repository"]["private"]),
                 ]
             )
+
+
+def list_enterprise_code_scanning_alerts(api_endpoint, github_pat, repo_list):
+    """
+    Get a list of all code scanning alerts on a given enterprise.
+
+    Inputs:
+    - API endpoint (for GHES/GHAE compatibility)
+    - PAT of appropriate scope
+    - Repository list in "org/repo" format (from enterprise.get_repo_report)
+
+    Outputs:
+    - List of _all_ code scanning alerts in enterprise that PAT user can access
+
+    Notes:
+    - Use `ghe-org-admin-promote` to gain ownership of all organizations.
+    - Personal repos will not be reported on, as they cannot use code scanning.
+    """
+
+    alerts = []
+    while True:
+        try:
+            repo_name = next(repo_list)  # skip the header by putting this up front
+            alerts.append(
+                list_repo_code_scanning_alerts(api_endpoint, github_pat, repo_name)
+            )
+        except StopIteration:
+            break
+        except Exception as e:
+            print(e)
+    return alerts
+
+
+def write_enterprise_cs_list(cs_list):
+    """
+    Write a list of code scanning alerts to a csv file.
+
+    Inputs:
+    - List from list_enterprise_code_scanning_alerts function, which contains
+      strings and lists of dictionaries for the alerts.
+
+    Outputs:
+    - CSV file of code scanning alerts
+    - CSV file of repositories not accessible or without code scanning enabled
+    """
+
+    for alert_list in cs_list:
+        if type(alert_list) == list:
+            print(alert_list)
+            with open("cs_list.csv", "a") as f:
+                writer = csv.writer(f)
+                writer.writerow(
+                    [
+                        "number",
+                        "created_at",
+                        "html_url",
+                        "state",
+                        "fixed_at",
+                        "dismissed_by",
+                        "dismissed_at",
+                        "dismissed_reason",
+                        "rule_id",
+                        "rule_severity",
+                        "rule_tags",
+                        "rule_description",
+                        "rule_name",
+                        "tool_name",
+                        "tool_version",
+                        "most_recent_instance_ref",
+                        "most_recent_instance_state",
+                        "most_recent_instance_sha",
+                        "instances_url",
+                    ]
+                )
+                for cs in alert_list:  # loop through each alert in the list
+                    if cs["state"] == "open":
+                        cs["fixed_at"] = "none"
+                        cs["dismissed_by"] = "none"
+                        cs["dismissed_at"] = "none"
+                        cs["dismissed_reason"] = "none"
+                    writer.writerow(
+                        [
+                            cs["number"],
+                            cs["created_at"],
+                            cs["html_url"],
+                            cs["state"],
+                            cs["fixed_at"],
+                            cs["dismissed_by"],
+                            cs["dismissed_at"],
+                            cs["dismissed_reason"],
+                            cs["rule"]["id"],
+                            cs["rule"]["severity"],
+                            cs["rule"]["tags"],
+                            cs["rule"]["description"],
+                            cs["rule"]["name"],
+                            cs["tool"]["name"],
+                            cs["tool"]["version"],
+                            cs["most_recent_instance"]["ref"],
+                            cs["most_recent_instance"]["state"],
+                            cs["most_recent_instance"]["commit_sha"],
+                            cs["instances_url"],
+                        ]
+                    )
+        else:
+            with open("excluded_repos.csv", "a") as g:
+                writer = csv.writer(g)
+                writer.writerow([alert_list])
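
A note on `write_enterprise_cs_list`: because it reopens `cs_list.csv` in append mode for every repository and writes the column names each time, the header row ends up repeated once per repository in the output. One possible alternative is sketched below, opening each file once and writing the header once. The function name, the path parameters, and the shortened `HEADER` list are illustrative assumptions, not part of the commit. The sketch also changes two behaviours on purpose: it overwrites the files instead of appending, and it splits the `"reason,org/repo"` string into two columns.

```python
import csv

# Shortened column list for illustration only; the commit writes 19 columns,
# several of them taken from nested keys such as cs["rule"]["id"].
HEADER = ["number", "created_at", "html_url", "state"]


def write_enterprise_alerts(
    cs_list, alerts_path="cs_list.csv", excluded_path="excluded_repos.csv"
):
    """Write alerts and skipped repositories, emitting the header row once."""
    with open(alerts_path, "w", newline="") as alerts_file, open(
        excluded_path, "w", newline=""
    ) as excluded_file:
        alerts_writer = csv.writer(alerts_file)
        excluded_writer = csv.writer(excluded_file)
        alerts_writer.writerow(HEADER)  # once per file, not once per repository
        for item in cs_list:
            if isinstance(item, list):  # a list of alert dictionaries
                for alert in item:
                    alerts_writer.writerow([alert[key] for key in HEADER])
            else:  # a "reason,org/repo" string from list_repo_code_scanning_alerts
                excluded_writer.writerow(item.split(",", 1))
```

Opening with `newline=""` follows the `csv` module's documented recommendation and avoids blank rows on Windows.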

src/enterprise.py

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+# This holds all the logic for the various enterprise differences.
+
+# Imports
+import csv
+from time import sleep
+import requests
+
+
+def get_enterprise_version(api_endpoint):
+    """
+    Get the version of GitHub Enterprise. It'll be used to account for
+    differences between GHES and GHAE and GHEC, like the organization secret
+    scanning API not existing outside GHEC.
+
+    GitHub AE returns "GitHub AE" as of M2
+    GHES returns the version of GHES that's installed (e.g. "3.4.0")
+    """
+    if api_endpoint != "https://api.github.com":
+        url = "{}/meta".format(api_endpoint)
+        response = requests.get(url)
+        if "installed_version" in response.json():
+            return response.json()["installed_version"]
+        else:
+            return "unknown version of GitHub"
+    else:
+        return "GHEC"
+
+
+def get_repo_report(url, github_pat):
+    """
+    Get the `all_repositories.csv` report from GHES / GHAE.
+    """
+    headers = {
+        "Accept": "application/vnd.github.v3+json",
+        "Authorization": "token {}".format(github_pat),
+    }
+    url = "{}/stafftools/reports/all_repositories.csv".format(url)
+    response = requests.get(url, headers=headers)
+    if response.status_code == 202:  # report needs to be generated
+        while response.status_code == 202:
+            print("Waiting a minute for the report to be generated ...")
+            sleep(60)
+            response = requests.get(url, headers=headers)
+    elif response.status_code == 200:  # report is ready
+        print("Report is ready! Reading it now ...")
+        for row in csv.reader(response.text.splitlines()):  # skip user repos
+            if row[2] == "Organization":
+                yield "{}/{}".format(row[3], row[5])
+            else:
+                pass
+    else:  # something went wrong with fetching the report
+        exit("Error: {}".format(response.status_code))
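
A note on `get_repo_report`: the `200` branch is an `elif` of the `202` branch, so when the first request returns `202` the function waits for the report and then falls through without reading it, yielding no repositories. A minimal sketch of one way to restructure the polling follows, with the status check moved after the wait loop. The name `read_org_repos`, the injected `fetch` callable, and the `sleep_fn` parameter are assumptions made to keep the example self-contained; the column positions (`row[2]`, `row[3]`, `row[5]`) are taken from the commit.

```python
import csv
from time import sleep


def read_org_repos(fetch, wait_seconds=60, sleep_fn=sleep):
    """Yield "org/repo" names once the report is ready.

    `fetch` is a hypothetical zero-argument callable returning an object with
    `status_code` and `text`, e.g. lambda: requests.get(url, headers=headers).
    """
    response = fetch()
    while response.status_code == 202:  # report is still being generated
        sleep_fn(wait_seconds)
        response = fetch()
    if response.status_code != 200:  # something went wrong fetching the report
        raise RuntimeError("Error: {}".format(response.status_code))
    # Checked after the loop, so a report that needed generating is still read.
    for row in csv.reader(response.text.splitlines()):
        if row[2] == "Organization":  # skips the header row and user-owned repos
            yield "{}/{}".format(row[3], row[5])
```

Raising an exception instead of calling `exit` is a design choice in this sketch, not something the commit does; it leaves the decision to stop or continue with the caller.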
