-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Amazon Linux 2023 #2692
Conversation
6502d28
to
cf8cc4d
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #2692 +/- ##
========================================
Coverage 76.48% 76.48%
========================================
Files 22 22
Lines 2220 2220
========================================
Hits 1698 1698
Misses 522 522
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
b587772
to
86792ad
Compare
922c9b3
to
e2f924c
Compare
dcaf56d
to
c63edb9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cookbooks/aws-parallelcluster-environment/kitchen.environment-install.yml
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-environment/resources/ec2_udev_rules/ec2_udev_rules_alinux2023.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-environment/test/controls/efs_spec.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-slurm/resources/dns_domain/dns_domain_alinux2023.rb
Outdated
Show resolved
Hide resolved
@@ -49,6 +49,8 @@ def package_platform | |||
platform_version = node['platform_version'].to_i | |||
if platform_version == 2 | |||
platform_version = 7 | |||
elsif platform_version == 2023 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad practice: redefinition of the same variable with different meaning.
Here we should have platform_version
and mysql_platform_version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are using the platform_version
is just the location in our s3 bucket where we archive the packages we want. Similar to what we do for Cinc-client.
archives/mysql/el/9/x86_64/mysql-community-client-*tar.gz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the bad practice is doing:
platform_version = define platform version with meaning 1
if platform_version == 2
platform_version = 7
use platform version with meaning 2
Not a blocking comment because the practice is not introduced in this PR, but better to address it at least in a follow up PR
...ooks/aws-parallelcluster-slurm/resources/slurm_dependencies/slurm_dependencies_alinux2023.rb
Outdated
Show resolved
Hide resolved
@@ -37,6 +37,10 @@ if [ -e "${SCRIPT}" ]; then | |||
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") \ | |||
&& curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/instance-id | |||
') | |||
echo "Install libxcrypt-compat dmidecode package by using SSH key: ${KITCHEN_SSH_KEY_PATH}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is something required only by AL2023, let's add a comment saying that.
If this is not, why adding it now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is in test code, can we address this later?
configured_ip=`nmcli -t -f IP4.ADDRESS device show ${DEVICE_NAME} | cut -f2 -d':'` | ||
if [ -z "${configured_ip}" ]; then | ||
# Setup connection method to "manual", configure ip address and gateway, only if not already configured. | ||
sudo nmcli connection modify "${con_name}" ipv4.method manual ipv4.addresses ${DEVICE_IP_ADDRESS}/${CIDR_PREFIX_LENGTH} ipv4.gateway ${GW_IP_ADDRESS} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this comment nmcli
is not supported: https://github.com/aws/aws-parallelcluster/pull/6223/files#r1630201744
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After testing, Amazon Linux 2023 does not require this script because amazon-ec2-net-utils is pre-installed in AL2023 and handles multi-nics instances properly.
test_efa
has passed on multi-nics instances
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazon Linux 2023 does not require this script
So we are removing the whole file, right?
cookbooks/aws-parallelcluster-platform/recipes/install/enable_services.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-environment/resources/ec2_udev_rules/ec2_udev_rules_alinux2023.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-environment/kitchen.environment-install.yml
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-environment/test/controls/efs_spec.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-slurm/resources/dns_domain/dns_domain_alinux2023.rb
Outdated
Show resolved
Hide resolved
action :cloudwatch_prerequisite do | ||
package "gnupg2-full" do | ||
options '--allowerasing' | ||
retries 3 | ||
retry_delay 5 | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cookbooks/aws-parallelcluster-platform/spec/unit/recipes/enable_services_spec.rb
Outdated
Show resolved
Hide resolved
cookbooks/aws-parallelcluster-platform/test/controls/enable_services_spec.rb
Outdated
Show resolved
Hide resolved
…flict on Docker but not on EC2
* Changing Package repos test to skip checking Repo name for epel
Inspec test has not supported Amazon Linux 2023. https://docs.chef.io/inspec/platforms/. Therefore, this commit disable the check which was generating false errors
Running this recipe on Alinux 2023 docker generates false failure: https://github.com/aws/aws-parallelcluster-cookbook/actions/runs/9373643185/job/25807894209?pr=2692 Signed-off-by: Hanwen <hanwenli@amazon.com>
Signed-off-by: Hanwen <hanwenli@amazon.com>
Signed-off-by: Hanwen <hanwenli@amazon.com>
Signed-off-by: Hanwen <hanwenli@amazon.com>
rsyslog is required to have Amazon Linux 2023 writes to messages log. The messages log is uploaded to CloudWatch. Therefore, this commits move the installation and start of rsyslog to CloudWatch recipe and removes the standalone recipe to enable rsyslog. Signed-off-by: Hanwen <hanwenli@amazon.com>
amazon-ec2-net-utils is pre-installed in AL2023 and handles multi-nics instances properly Signed-off-by: Hanwen <hanwenli@amazon.com>
Signed-off-by: Hanwen <hanwenli@amazon.com>
@@ -50,7 +50,9 @@ | |||
cwd '/tmp' | |||
code <<-CUDA | |||
set -e | |||
./cuda.run --silent --toolkit --samples | |||
mkdir /cuda-install | |||
./cuda.run --silent --toolkit --samples --tmpdir=/cuda-install |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cuda-install
is meant to be a temporary directory, so we should create it under /tmp
.
That said it's fine to use recursive deletion of that folder since it's a temp directory created by us and not meant ot be used by the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/tmp
on some OSes has size limit. That's why we had to change the directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then please add a comment on top of it to explain why we are not using tmp dir.
However, we should be in control of the tmp dir size and adjust it to our needs.
May you please track in the backlog the possibility to control the /tmp size as a separate partition (that is also a storage best practice)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, added a comment
Description of changes
Tests
AMIs build are successful. All integration tests in develop.yaml have been run.
Only the following 6 tests are failing:
References
Checklist
develop
add the branch name as prefix in the PR title (e.g.[release-3.6]
).Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.