Add Troubleshooting Guide and Improve Installer Script Variable Validation #234
base: devel
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: taliandre49
@taliandre49: PR is not mergeable. The PR state is: blocked
Welcome @taliandre49! It looks like this is your first PR to ocp-power-automation/openshift-install-power 🎉
    variable "rhel_image_name" {
      default = "rhel-8.9"
    }

    variable "rhcos_image_name" {
      default = "rhcos-4.15"
    }
Suggested change:
    - variable "rhel_image_name" {
    -   default = "rhel-8.9"
    - }
    - variable "rhcos_image_name" {
    -   default = "rhcos-4.15"
    - }
    + variable "rhel_image_name" {
    +   default = "rhel-9.6"
    + }
    + variable "rhcos_image_name" {
    +   default = "rhcos-4.19"
    + }
Thanks for the suggestions! From what I saw, the wrapper defaults the RHCOS image to 4.15:
RELEASE_VER OpenShift release version (Default: 4.15)
RELEASE_VER=${RELEASE_VER:-"4.15"}
Just wanted to confirm if the plan is to bump the default version here as well?
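For reference, a minimal sketch of how the wrapper's `${RELEASE_VER:-"4.15"}` fallback behaves, and how an image-name default could be derived from it so the two do not drift apart. `default_rhcos_image` and `RHCOS_IMAGE_NAME` are hypothetical names used only for illustration, and the `rhcos-<version>` naming is assumed from the defaults in this diff rather than confirmed against the image catalog.

```shell
# Sketch only: default_rhcos_image and RHCOS_IMAGE_NAME are hypothetical,
# not part of the installer script.

default_rhcos_image() {
  # ${1:-4.15} falls back to 4.15 when the argument is unset OR empty,
  # the same expansion the wrapper uses for RELEASE_VER.
  local release_ver="${1:-4.15}"
  printf 'rhcos-%s\n' "$release_ver"
}

# Resolve the release first, then derive the image name only if the
# caller has not already chosen one.
RELEASE_VER="${RELEASE_VER:-4.15}"
RHCOS_IMAGE_NAME="${RHCOS_IMAGE_NAME:-$(default_rhcos_image "$RELEASE_VER")}"

echo "RELEASE_VER=${RELEASE_VER} RHCOS_IMAGE_NAME=${RHCOS_IMAGE_NAME}"
```

With this shape, bumping the release default would only need to happen in one place, which may be what the question above is really about.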
Yeah - I'd recommend changing it around. Maybe grab 30 minutes with me on Monday or Tuesday and we can go through this together? Then we can work with @yussufsh to review?
Sounds great, thanks. I’ll grab some time on your calendar!
    terraform state rm <resource-name>
Suggested change:
    - terraform state rm <resource-name>
    + terraform state rm <resource-name>
I don't think we should recommend this to customers.
docs/troubleShooting.md
Outdated
    terraform taint module.nodes.ibm_pi_instance.worker[0]
    terraform apply
Suggested change:
    - terraform taint module.nodes.ibm_pi_instance.worker[0]
    - terraform apply
    + terraform taint module.nodes.ibm_pi_instance.worker[0]
    + terraform apply
I am 99% sure you don't get the labels added.
    Incorrect Storage Type (e.g. "nfs" not recognized)

    Error: "pi_volume_type" must contain a value from ["ssd", "standard", "tier1", "tier3"], got ""
Can you share where you hit this? It's probably a bug in the tfvars.
This error occurred on pi_volume_type for the bastion storage type. Please see the full error below:
Error: "pi_volume_type" must contain a value from []string{"ssd", "standard", "tier1", "tier3"}, got ""
│ with module.prepare.ibm_pi_volume.volume[0],
│ on modules/1_prepare/prepare.tf line 87, in resource "ibm_pi_volume" "volume":
│ 87: pi_volume_type = local.bastion_storage_type
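Since the failure above comes from an empty value reaching the provider, one option is to catch it in the wrapper before Terraform runs. The following is a rough sketch under assumptions: `validate_storage_type` and `BASTION_STORAGE_TYPE` are hypothetical names, and the allowed values are copied from the error message above, so they may differ for other provider versions.

```shell
# Sketch only: validate_storage_type is a hypothetical helper; the allowed
# values are taken from the provider error message quoted above.

validate_storage_type() {
  local value="$1"
  case "$value" in
    ssd|standard|tier1|tier3)
      return 0 ;;
    "")
      echo "storage type is empty; expected one of: ssd, standard, tier1, tier3" >&2
      return 1 ;;
    *)
      echo "storage type '$value' is not valid; expected one of: ssd, standard, tier1, tier3" >&2
      return 1 ;;
  esac
}

# Usage: BASTION_STORAGE_TYPE is an assumed variable name.
if validate_storage_type "${BASTION_STORAGE_TYPE:-}"; then
  echo "storage type looks valid"
else
  echo "fix the storage type before running terraform apply"
fi
```

An empty value and an unsupported value such as "nfs" get separate messages, which makes it easier to tell a missing tfvars entry apart from a typo.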
docs/troubleShooting.md
Outdated
    On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists.
    PowerVS does not allow duplicate network names, even if the old network is inactive.

    **Resolution:**

    - Log into your PowerVS workspace.

    - Delete or rename the existing ocp-net network or subnet.

    - Re-run the installer:
    ```bash
    terraform apply ./openshift-install-powervs create
    ```

    ⚠️ Renaming networks automatically is not recommended; it can lead to subnet sprawl and degraded performance.
Suggested change:
    - On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists.
    - PowerVS does not allow duplicate network names, even if the old network is inactive.
    - **Resolution:**
    - - Log into your PowerVS workspace.
    - - Delete or rename the existing ocp-net network or subnet.
    - - Re-run the installer:
    - ```bash
    - terraform apply ./openshift-install-powervs create
    - ```
    - ⚠️ Renaming networks automatically is not recommended; it can lead to subnet sprawl and degraded performance.
    + On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists.
    + PowerVS does not allow duplicate network names, even if the old network is inactive.
    + **Resolution:**
    + - Log into your PowerVS workspace.
    + - Delete or rename the existing ocp-net network or subnet.
    + - Re-run the installer:
    + ```bash
    + terraform apply ./openshift-install-powervs create
This one is kind of strange. I have multiple questions about it.
    if [[ "$ACTION" != "help" ]]; then
      check_required_env_vars
    fi
IMO this function is not required; the precheck_input function already exists. Also, almost all ENV vars have default values, and the docs already provide appropriate guidance.
    #-------------------------------------------------------------------------
    # Check for required environment variables and display helpful information
    #-------------------------------------------------------------------------
    function check_required_env_vars {
      local missing_vars=0

      log "Checking required environment variables..."

      # Check IBMCLOUD_API_KEY (required)
      if [[ -z "${IBMCLOUD_API_KEY}" ]]; then
        warn "IBMCLOUD_API_KEY is not set"
        echo " Description: IBM Cloud API key for authentication"
        echo " How to set: export IBMCLOUD_API_KEY='your-api-key-here'"
        echo ""
        missing_vars=1
      fi

      # Check RELEASE_VER (optional since we have default)
      if [[ -z "${RELEASE_VER}" ]]; then
        warn "RELEASE_VER is not set (will use default: 4.15)"
        echo " Description: OpenShift release version to install"
        echo " Default: 4.15"
        echo " How to set: export RELEASE_VER='4.16'"
        echo ""
      else
        log "Using OpenShift release version: ${RELEASE_VER}. To change it, run: export RELEASE_VER='<version>'"
      fi

      # Check RHEL_SUBS_PASSWORD (optional)
      if [[ -z "${RHEL_SUBS_PASSWORD}" ]]; then
        warn "RHEL_SUBS_PASSWORD is not set"
        echo " Description: RHEL subscription password for bastion nodes"
        echo " Note: You can provide this during the 'variables' prompt or set it now"
        echo " How to set: export RHEL_SUBS_PASSWORD='your-password-here'"
        echo ""
      fi

      # Check NO_OF_RETRY (optional)
      if [[ -z "${NO_OF_RETRY}" ]]; then
        log "NO_OF_RETRY not set (using default: 5)"
      else
        log "Using retry count: ${NO_OF_RETRY}"
      fi

      # Check ARTIFACTS_VERSION (optional)
      if [[ -z "${ARTIFACTS_VERSION}" ]]; then
        log "ARTIFACTS_VERSION not set (using default: main)"
      else
        log "Using artifacts version: ${ARTIFACTS_VERSION}"
      fi

      echo ""

      # Only IBMCLOUD_API_KEY is treated as required; the others fall back to defaults
      if [[ $missing_vars -eq 1 ]]; then
        error "Required environment variables are missing. Please set them and try again."
      fi

      success "Environment variable check completed"
    }
same comment as above.
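If a check like this is kept at all, a smaller data-driven shape might be easier to reconcile with precheck_input. This is only a sketch under assumptions: `require_env_vars` is a hypothetical helper, not something in the installer, and it reports plain messages to stderr instead of using the script's own log/warn/error functions.

```shell
# Sketch only: require_env_vars is a hypothetical helper, not the
# installer's precheck_input.

require_env_vars() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup of the variable named by $name. The names are
    # hard-coded by the caller, never taken from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "ERROR: $name is not set (export $name='...')" >&2
      missing=1
    fi
  done
  # Returns 0 when every variable is set and non-empty, 1 otherwise.
  return "$missing"
}

# Usage: only variables without a default need to be listed.
if ! require_env_vars IBMCLOUD_API_KEY; then
  echo "Set the missing variables and re-run." >&2
fi
```

Listing only the variables that have no default keeps the check short, since everything else already falls back to a documented default.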
This PR enhances the installer script to improve usability and reduce common setup issues.
Key Updates:
Testing:
Context:
These changes address issues observed when users and/or customers encounter missing or outdated image references during installation. The improved checks and messages help users identify configuration gaps earlier and adjust environment variables accordingly.