Rocky Linux support added (#18)
* refactoring crawler

- split Ubuntu handling from updater/service.py as first to make code more readable
- note: debugging code still in place

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* refactoring crawler

    - split Debian handling from updater/service.py
    - split Alma Linux handling from updater/service.py

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* Added support for crawling Flatcar Container Linux

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - removed no longer used release_update_check from updater/service.py
- added error message for unsupported distributions
- updated changelog
- updated README.md with supported distributions

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - make use of loguru for simple configuration of logging
- AlmaLinux crawling not yet fully functional again

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - AlmaLinux crawling improved - it no longer simply fetches the first hit - and is working again
- extended requirements.txt for loguru

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - added Debian 12 aka bookworm to image-sources.yaml
- fixed debug output for checksums

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - added first try on Fedora
- template not yet finished
- last checksum query must be adjusted for distributions like Fedora
- exporter must be adjusted

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - adjusted checksum query for Fedora - could be more generic for Distributions with no "minor" release updates
- adjusted database queries - get distribution_version needed for Fedora
- updated template for export

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - added --debug as argument
- added first try on Fedora Linux Support to crawler
- added Debian Linux 12 aka bookworm to sample image-sources.yaml

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - removed old branch from Dockerfile

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - added Rocky Linux for crawler
- added template for Rocky Linux for exporter

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* - corrected template for Rocky Linux

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

* Updated Changelog and README

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>

---------

Signed-off-by: Christian Otto Stelter <cosinus@unit42.de>
Signed-off-by: Christian Otto Stelter <62995499+costelter@users.noreply.github.com>
Co-authored-by: Christian Otto Stelter <62995499+costelter@users.noreply.github.com>
stelterlab and costelter authored Jun 12, 2023
1 parent dc62071 commit 2a0cc87
Showing 7 changed files with 219 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
- added --debug as argument
- added first try on Fedora Linux Support to crawler
- added Debian Linux 12 aka bookworm to sample image-sources.yaml
- added Rocky Linux

## 2023-06-01
- updated example Dockerfile to new repos
2 changes: 2 additions & 0 deletions README.md
@@ -9,6 +9,8 @@ Supported distributions:
- AlmaLinux
- Flatcar Container Linux
- Fedora Linux
- Rocky Linux


Note: Flatcar Container Linux offers only zipped images, so a direct upload via OpenStack Image Manager/Glance is not supported (yet).

4 changes: 2 additions & 2 deletions crawler/updater/fedora.py
@@ -60,7 +60,7 @@ def get_image_filename(release, images_url):
     else:
         return None

-def get_checksum(release, images_url, images_filename):
+def get_checksum(release, images_url, image_filename):
     request = requests.get(images_url, allow_redirects=True)
     soup = BeautifulSoup(request.text, "html.parser")

@@ -87,7 +87,7 @@ def get_checksum(release, images_url, images_filename):
         # skip comment starting with hash
         if re.match('^#', line):
             continue
-        if images_filename in line:
+        if image_filename in line:
             # logger.debug("matched: " + line)
             (filename, new_checksum) = line.split(" = ")
             # logger.debug("new_checksum: " + new_checksum)
152 changes: 152 additions & 0 deletions crawler/updater/rocky.py
@@ -0,0 +1,152 @@
# rocky.py
#
# crawl distribution Rocky Linux

import requests
import re

from crawler.web.generic import url_get_last_modified, url_fetch_content
from crawler.web.directory import web_get_current_image_metadata

from bs4 import BeautifulSoup
from loguru import logger


def build_image_url(release, major_minor, imagefile_name):
    if not release["baseURL"].endswith("/"):
        base_url = release["baseURL"] + "/" + major_minor + "/"
    else:
        base_url = release["baseURL"] + major_minor + "/"

    # if not versionpath.endswith("/"):
    #     versionpath = versionpath + "/"

    return (
        base_url + release["imagepath"] + "/" + imagefile_name
    )
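
The trailing-slash handling in `build_image_url` can also be written as a single join. A minimal sketch, assuming a hypothetical helper `join_url` that is not part of this commit; the URL parts are the ones from `image-sources.yaml` and the example filename in `get_metadata`:

```python
def join_url(*parts):
    """Hypothetical helper: join URL parts with exactly one slash between them."""
    # strip("/") only removes leading and trailing slashes, so "https://" stays intact
    return "/".join(str(part).strip("/") for part in parts if part)


url = join_url(
    "https://download.rockylinux.org/pub/rocky/",
    "9.2",
    "images/x86_64",
    "Rocky-9-GenericCloud-Base-9.2-20230513.0.x86_64.qcow2",
)
print(url)
```

The result is the same whether or not `baseURL` ends with a slash, which is the case the `if`/`else` above distinguishes by hand.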

def get_metadata(release):
    # as specified in image-sources.yaml
    # baseURL: https://download.rockylinux.org/pub/rocky/
    if not release["baseURL"].endswith("/"):
        base_url = release["baseURL"] + "/" + release["name"] + "/"
    else:
        base_url = release["baseURL"] + release["name"] + "/"

    requestURL = base_url + release["imagepath"]
    # TODO: make this configurable in image-sources.yaml
    #
    # group(1) contains major version as 9
    # group(2) contains full version with minor version number as 9.2
    # group(3) contains release date as 20230513
    # group(4) contains release date suffix 0 (subversion?)
    # ex. Rocky-9-GenericCloud-Base-9.2-20230513.0.x86_64.qcow2

    # the dot between release date and suffix is escaped so it matches a literal "."
    filename_pattern = re.compile(r"Rocky-(\d+)-GenericCloud-Base-(\d+\.\d+)-(\d+)\.(\d+)\.x86_64\.qcow2")

    logger.debug("request_URL: " + requestURL)

    request = requests.get(requestURL, allow_redirects=True)
    soup = BeautifulSoup(request.text, "html.parser")

    for link in soup.find_all("a"):
        data = link.get("href")
        # logger.debug("data: " + data)

        if filename_pattern.search(data):
            logger.debug("pattern matched for " + data)
            extract = filename_pattern.search(data)
            major_minor = extract.group(2)
            version = extract.group(3)

            release_date = (
                version[0:4]
                + "-"
                + version[4:6]
                + "-"
                + version[6:8]
            )
            logger.debug("url: " + build_image_url(release, major_minor, data))
            logger.debug("last version: " + version)
            logger.debug("release_date: " + release_date)

            return {
                "url": build_image_url(release, major_minor, data),
                "version": version,
                "release_date": release_date,
            }

    return None
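
The filename pattern used in `get_metadata` can be exercised without any network access. A minimal sketch, assuming the example filename from the comment above; `parse_image_filename` is a hypothetical helper, not part of this commit, and the dot between release date and suffix is escaped here so that it only matches a literal dot:

```python
import re

# Pattern for names like Rocky-9-GenericCloud-Base-9.2-20230513.0.x86_64.qcow2
FILENAME_PATTERN = re.compile(
    r"Rocky-(\d+)-GenericCloud-Base-(\d+\.\d+)-(\d+)\.(\d+)\.x86_64\.qcow2"
)


def parse_image_filename(filename):
    """Hypothetical helper: split a Rocky image filename into its parts."""
    match = FILENAME_PATTERN.search(filename)
    if match is None:
        return None
    date = match.group(3)  # release date as YYYYMMDD
    return {
        "major": match.group(1),
        "major_minor": match.group(2),
        "version": date,
        "release_date": f"{date[0:4]}-{date[4:6]}-{date[6:8]}",
    }


info = parse_image_filename("Rocky-9-GenericCloud-Base-9.2-20230513.0.x86_64.qcow2")
print(info)
```

Names that do not carry a version and date, such as the `latest` alias configured as `imagename`, do not match and yield `None`.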

def get_checksum(checksum_url, image_filename):

    checksum_list = url_fetch_content(checksum_url)
    if checksum_list is None:
        return None

    for line in checksum_list.splitlines():
        # logger.debug("line: " + line)

        # skip comment starting with hash
        if re.match('^#', line):
            continue
        if image_filename in line:
            # logger.debug("matched: " + line)
            (filename, new_checksum) = line.split(" = ")
            # logger.debug("new_checksum: " + new_checksum)

            return new_checksum

    return None
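
The line matching in `get_checksum` can be tried on a plain string. A minimal sketch, assuming BSD-style lines of the form `SHA256 (file) = digest`, which is what the `split(" = ")` above implies; `find_checksum` is a hypothetical stand-alone variant and the sample content is made up, not real digests:

```python
def find_checksum(checksum_text, image_filename):
    """Hypothetical variant of get_checksum that works on already fetched text."""
    for line in checksum_text.splitlines():
        # comment lines start with a hash and may also mention the filename
        if line.startswith("#"):
            continue
        if image_filename in line:
            # partition() tolerates a missing separator, unlike unpacking split()
            _, separator, checksum = line.partition(" = ")
            if separator:
                return checksum.strip()
    return None


# Made-up sample content for illustration only
SAMPLE = """\
# Rocky-9-GenericCloud.latest.x86_64.qcow2: 1044381696 bytes
SHA256 (Rocky-9-GenericCloud.latest.x86_64.qcow2) = 0123abcd
SHA256 (Rocky-9-GenericCloud-Base.latest.x86_64.qcow2) = 4567ef01
"""

print(find_checksum(SAMPLE, "Rocky-9-GenericCloud.latest.x86_64.qcow2"))
```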

def rocky_update_check(release, last_checksum):
    # as specified in image-sources.yaml
    # baseURL: https://download.rockylinux.org/pub/rocky/
    if not release["baseURL"].endswith("/"):
        base_url = release["baseURL"] + "/" + release["name"] + "/"
    else:
        base_url = release["baseURL"] + release["name"] + "/"

    checksum_url = base_url + release["imagepath"] + "/" + release["checksumname"]

    logger.debug("checksum_url: " + checksum_url)

    # as specified in image-sources.yaml
    # imagename: Rocky-9-GenericCloud.latest.x86_64
    # extension: qcow2
    imagename = release["imagename"] + "." + release["extension"]

    logger.debug("imagename: " + imagename)

    current_checksum = get_checksum(checksum_url, imagename)

    if current_checksum is None:
        logger.error(
            "no matching checksum found - check image (%s) "
            "and checksum filename (%s)" % (imagename, release["checksumname"])
        )
        return None

    logger.debug("current_checksum: " + current_checksum)

    # as specified in image-sources.yaml
    # algorithm: sha256
    current_checksum = release["algorithm"] + ":" + current_checksum

    if current_checksum != last_checksum:
        logger.debug("current_checksum " + current_checksum + " differs from last_checksum " + last_checksum)

        image_metadata = get_metadata(release)
        if image_metadata is not None:
            logger.debug("got metadata")
            update = {}
            update["release_date"] = image_metadata["release_date"]
            update["url"] = image_metadata["url"]
            update["version"] = image_metadata["version"]
            update["checksum"] = current_checksum
            return update
        else:
            logger.warning("got no metadata")
            return None

    return None
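
The update decision in `rocky_update_check` comes down to comparing an `algorithm:digest` string with the stored value. A minimal sketch of that comparison in isolation; `checksum_changed` is a hypothetical helper and the digests are placeholders:

```python
def checksum_changed(algorithm, digest, last_checksum):
    """Hypothetical helper: True if the fresh digest differs from the stored value."""
    # the stored value is expected in the form "<algorithm>:<digest>"
    return f"{algorithm}:{digest}" != last_checksum


print(checksum_changed("sha256", "0123abcd", "sha256:0123abcd"))
print(checksum_changed("sha256", "0123abcd", "sha256:4567ef01"))
```

Because the comparison uses an f-string rather than `+`, it would also work if `last_checksum` were ever `None`; whether that can happen depends on the database layer, which is not shown in this commit.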
5 changes: 5 additions & 0 deletions crawler/updater/service.py
@@ -6,6 +6,8 @@
from crawler.updater.alma import alma_update_check
from crawler.updater.flatcar import flatcar_update_check
from crawler.updater.fedora import fedora_update_check
from crawler.updater.rocky import rocky_update_check



def image_update_service(connection, source):
@@ -32,6 +34,9 @@ def image_update_service(connection, source):
            catalog_update = flatcar_update_check(release, last_checksum)
        elif "Fedora" in release["imagename"]:
            catalog_update = fedora_update_check(release, last_checksum)
        elif "Rocky" in release["imagename"]:
            catalog_update = rocky_update_check(release, last_checksum)

        else:
            logger.error("Unsupported distribution " + source["name"] + " - please check your image-sources.yaml")
            raise SystemExit(1)
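
As the `elif` chain grows with each distribution, the same dispatch could be expressed as a lookup table. A minimal sketch, not the structure used in this commit: `select_update_check`, the stub functions, and the `ValueError` are assumptions for illustration, and only the `"Fedora"` and `"Rocky"` markers are taken from the diff above.

```python
def fedora_stub(release, last_checksum):
    """Stand-in for fedora_update_check."""
    return None


def rocky_stub(release, last_checksum):
    """Stand-in for rocky_update_check."""
    return None


# Marker substring in release["imagename"] -> update check function.
# Insertion order matters because matching is by substring.
UPDATE_CHECKS = {
    "Fedora": fedora_stub,
    "Rocky": rocky_stub,
}


def select_update_check(imagename, checks=UPDATE_CHECKS):
    """Hypothetical helper: pick the update check for an image name."""
    for marker, check in checks.items():
        if marker in imagename:
            return check
    raise ValueError("Unsupported distribution for image " + imagename)


check = select_update_check("Rocky-9-GenericCloud.latest.x86_64")
print(check.__name__)
```

Adding a distribution would then mean one new dictionary entry instead of another `elif` branch.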
25 changes: 23 additions & 2 deletions etc/image-sources.yaml
@@ -102,8 +102,7 @@ sources:
extension: img.gz
checksumname: flatcar_production_openstack_image.img.gz.DIGESTS
algorithm: md5
immutable: true


- name: Fedora
vendor: "Fedora Project"
releases:
@@ -117,3 +116,25 @@ sources:
checksumname: CHECKSUM
algorithm: sha256
limit: 1

  - name: RockyLinux
    vendor: "Rocky Linux Foundation"
    releases:
      - name: '8'
        codename: 'none'
        baseURL: https://download.rockylinux.org/pub/rocky/
        imagepath: images/x86_64
        imagename: Rocky-8-GenericCloud.latest.x86_64
        extension: qcow2
        checksumname: CHECKSUM
        algorithm: sha256
        limit: 1
      - name: '9'
        codename: 'none'
        baseURL: https://download.rockylinux.org/pub/rocky/
        imagepath: images/x86_64
        imagename: Rocky-9-GenericCloud.latest.x86_64
        extension: qcow2
        checksumname: CHECKSUM
        algorithm: sha256
        limit: 1
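
`rocky.py` reads seven keys from each release entry: `name`, `baseURL`, `imagepath`, `imagename`, `extension`, `checksumname` and `algorithm`. A minimal sketch of a pre-flight check on an already parsed entry; `missing_release_keys` is a hypothetical helper and the dictionary mirrors the Rocky Linux 9 entry above:

```python
# Keys accessed by rocky_update_check and get_metadata
REQUIRED_KEYS = (
    "name",
    "baseURL",
    "imagepath",
    "imagename",
    "extension",
    "checksumname",
    "algorithm",
)


def missing_release_keys(release):
    """Hypothetical helper: list required keys that a release entry lacks."""
    return [key for key in REQUIRED_KEYS if key not in release]


rocky9 = {
    "name": "9",
    "codename": "none",
    "baseURL": "https://download.rockylinux.org/pub/rocky/",
    "imagepath": "images/x86_64",
    "imagename": "Rocky-9-GenericCloud.latest.x86_64",
    "extension": "qcow2",
    "checksumname": "CHECKSUM",
    "algorithm": "sha256",
    "limit": 1,
}

print(missing_release_keys(rocky9))
```

A check like this would turn a `KeyError` deep inside the crawler into a readable configuration error.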
34 changes: 34 additions & 0 deletions templates/rockylinux.yml.j2
@@ -0,0 +1,34 @@

  - name: {{ catalog['name'] }} {{ catalog['os_version'] }}
    format: qcow2
    login: rocky
    min_disk: 10
    min_ram: 512
    status: active
    visibility: public
    multi: true
    meta:
      architecture: x86_64
      hypervisor_type: qemu
      hw_disk_bus: scsi
      hw_rng_model: virtio
      hw_scsi_model: virtio-scsi
      hw_qemu_guest_agent: yes
      hw_watchdog_action: reset
      replace_frequency: quarterly
      hotfix_hours: 0
      uuid_validity: last-3
      provided_until: none
      os_distro: rocky
      os_version: '{{ catalog['os_version'] }}'
    tags: []
    latest_checksum_url: {{ metadata['baseURL'] }}{{ catalog['os_version'] }}/{{ metadata['imagepath'] }}/{{ metadata['checksumname'] }}
    latest_url: {{ metadata['baseURL'] }}{{ catalog['os_version'] }}/{{ metadata['imagepath'] }}/{{ metadata['imagename'] }}.{{ metadata['extension'] }}
    versions:{% for release_version in catalog['versions'] %}
      - version: '{{ release_version }}'
        url: {{ catalog['versions'][release_version]['url'] }}
        checksum: {{ catalog['versions'][release_version]['checksum'] }}
        build_date: {{ catalog['versions'][release_version]['release_date'] }}
        image_source: {{ catalog['versions'][release_version]['url'] }}
        image_description: https://docs.rockylinux.org/release_notes/
    {%- endfor %}
