@taliandre49

This PR enhances the installer script to improve usability and reduce common setup issues.

Key Updates:

  • Added clearer validation and guidance for required environment variables (IBMCLOUD_API_KEY, RELEASE_VER, etc.)
  • Introduced detailed echo messages explaining defaults and how to override them (e.g., RHCOS version).
  • Clarified the default RHCOS image version (8.3) and provided explicit guidance on changing image versions.
  • New troubleshooting.md under docs/
    • Documents common OpenShift on PowerVS installation issues, their causes, and step-by-step resolutions.
    • Covers Terraform state issues, Bastion OS compatibility, reinstallation conflicts, remote-exec failures, LPAR health issues, and missing image errors.
    • References verified solutions and relevant IBM/PowerVS documentation for easier user debugging.

Testing:

  • Verified successful execution by running the installer script end-to-end after these updates.

Context:

These changes address issues observed when users and/or customers encounter missing or outdated image references during installation. The improved checks and messages help users identify configuration gaps earlier and adjust environment variables accordingly.

@ppc64le-cloud-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: taliandre49
Once this PR has been reviewed and has the lgtm label, please assign yussufsh for approval. For more information see the Kubernetes Code Review Process.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ppc64le-cloud-bot
ppc64le-cloud-bot commented Oct 15, 2025

@taliandre49: PR is not mergeable.

The PR state is: blocked

@ppc64le-cloud-bot

Welcome @taliandre49! It looks like this is your first PR to ocp-power-automation/openshift-install-power 🎉

Comment on lines 178 to 184
variable "rhel_image_name" {
  default = "rhel-8.9"
}

variable "rhcos_image_name" {
  default = "rhcos-4.15"
}
Contributor

Suggested change
- variable "rhel_image_name" {
-   default = "rhel-8.9"
- }
- variable "rhcos_image_name" {
-   default = "rhcos-4.15"
- }
+ variable "rhel_image_name" {
+   default = "rhel-9.6"
+ }
+ variable "rhcos_image_name" {
+   default = "rhcos-4.19"
+ }

Author

Thanks for the suggestions! From what I saw, the wrapper defaults the RHCOS image to 4.15:

RELEASE_VER OpenShift release version (Default: 4.15)

RELEASE_VER=${RELEASE_VER:-"4.15"}

Just wanted to confirm if the plan is to bump the default version here as well?
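
For context, the `RELEASE_VER=${RELEASE_VER:-"4.15"}` line quoted above relies on shell default-value expansion: the default applies when the variable is unset or empty, and an exported value overrides it. The following is only a minimal sketch of that behavior; `resolve_release_ver` is a hypothetical helper used for illustration and is not part of the installer script.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the installer): prints the effective release
# version, falling back to 4.15 when RELEASE_VER is unset or empty.
resolve_release_ver() {
  local ver=${RELEASE_VER:-"4.15"}
  echo "$ver"
}

unset RELEASE_VER
resolve_release_ver   # no value exported, so the default is used

RELEASE_VER="4.19"
resolve_release_ver   # the exported value takes precedence
```

If the wrapper default were bumped, only the literal after `:-` would need to change; any value a user exports would still win.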

Contributor

Yeah - I'd recommend changing it around. Maybe grab 30 minutes with me on Monday or Tuesday and we can go through this together? Then we can work with @yussufsh to review?

Author

Sounds great, thanks. I’ll grab some time on your calendar!

Comment on lines +29 to +30

terraform state rm <resource-name>
Contributor

Suggested change
terraform state rm <resource-name>

Contributor

I don't think we should recommend this to customers.

Comment on lines 42 to 44

terraform taint module.nodes.ibm_pi_instance.worker[0]
terraform apply
Contributor

Suggested change
terraform taint module.nodes.ibm_pi_instance.worker[0]
terraform apply

I am 99% sure you don't get the labels added.

Comment on lines +66 to +68
Incorrect Storage Type (e.g. "nfs" not recognized)

Error: "pi_volume_type" must contain a value from ["ssd", "standard", "tier1", "tier3"], got ""
Contributor

Can you share where you hit this? It's probably a bug in the tfvars.

Author

This error occurred on pi_volume_type for the bastion storage type. Please see the full error below:

Error: "pi_volume_type" must contain a value from []string{"ssd", "standard", "tier1", "tier3"}, got ""
│   with module.prepare.ibm_pi_volume.volume[0],
│   on modules/1_prepare/prepare.tf line 87, in resource "ibm_pi_volume" "volume":
│   87:   pi_volume_type       = local.bastion_storage_type
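
The empty value in this error could in principle be caught before Terraform runs. The following is a rough sketch of such a pre-flight check, assuming the allowed values are exactly those listed in the error message; `validate_storage_type` is a hypothetical function and does not exist in the installer or the Terraform modules.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the installer or the modules).
# The allowed values are copied from the Terraform error message above.
validate_storage_type() {
  local value="$1"
  case "$value" in
    ssd|standard|tier1|tier3)
      return 0
      ;;
    *)
      echo "Invalid storage type '${value}': expected one of ssd, standard, tier1, tier3" >&2
      return 1
      ;;
  esac
}

validate_storage_type "tier3" && echo "tier3: accepted"
validate_storage_type "" || echo "empty value: rejected"
```

A check like this only reports the symptom earlier; if the empty value comes from the tfvars, that is still where the fix belongs.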

Comment on lines 87 to 101
On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists.
PowerVS does not allow duplicate network names—even if the old network is inactive.

**Resolution:**

- Log into your PowerVS workspace.

- Delete or rename the existing ocp-net network or subnet.

- Re-run the installer:
```bash
terraform apply ./openshift-install-powervs create
```

⚠️ Renaming networks automatically is not recommended—it can lead to subnet sprawl and degraded performance.
Contributor

Suggested change
On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists.
PowerVS does not allow duplicate network names—even if the old network is inactive.
**Resolution:**
- Log into your PowerVS workspace.
- Delete or rename the existing ocp-net network or subnet.
- Re-run the installer:
```bash
terraform apply ./openshift-install-powervs create
```
⚠️ Renaming networks automatically is not recommended—it can lead to subnet sprawl and degraded performance.

This one is kind of strange. There are multiple question marks for me on this one.

Comment on lines +1758 to +1760
if [[ "$ACTION" != "help" ]]; then
  check_required_env_vars
fi
Collaborator

IMO this function is not required; the precheck_input function already exists. (Also, almost all env vars have default values, and appropriate guidance is available in the docs.)

Comment on lines +162 to +220
#-------------------------------------------------------------------------
# Check for required environment variables and display helpful information
#-------------------------------------------------------------------------
function check_required_env_vars {
  local missing_vars=0

  log "Checking required environment variables..."

  # Check IBMCLOUD_API_KEY (required)
  if [[ -z "${IBMCLOUD_API_KEY}" ]]; then
    warn "IBMCLOUD_API_KEY is not set"
    echo "  Description: IBM Cloud API key for authentication"
    echo "  How to set:  export IBMCLOUD_API_KEY='your-api-key-here'"
    echo ""
    missing_vars=1
  fi

  # Check RELEASE_VER (optional since we have a default)
  if [[ -z "${RELEASE_VER}" ]]; then
    warn "RELEASE_VER is not set (will use default: 4.15; export RELEASE_VER to use a different version)"
    echo "  Description: OpenShift release version to install"
    echo "  Default:     4.15"
    echo "  How to set:  export RELEASE_VER='4.16'"
    echo ""
  else
    log "Using OpenShift release version: ${RELEASE_VER}. To change it, run: export RELEASE_VER='<version>'"
  fi

  # Check RHEL_SUBS_PASSWORD (optional)
  if [[ -z "${RHEL_SUBS_PASSWORD}" ]]; then
    warn "RHEL_SUBS_PASSWORD is not set"
    echo "  Description: RHEL subscription password for bastion nodes"
    echo "  Note:        You can provide this during the 'variables' prompt or set it now"
    echo "  How to set:  export RHEL_SUBS_PASSWORD='your-password-here'"
    echo ""
  fi

  # Check NO_OF_RETRY (optional)
  if [[ -z "${NO_OF_RETRY}" ]]; then
    log "NO_OF_RETRY not set (using default: 5)"
  else
    log "Using retry count: ${NO_OF_RETRY}"
  fi

  # Check ARTIFACTS_VERSION (optional)
  if [[ -z "${ARTIFACTS_VERSION}" ]]; then
    log "ARTIFACTS_VERSION not set (using default: main)"
  else
    log "Using artifacts version: ${ARTIFACTS_VERSION}"
  fi

  echo ""

  # Report once, after all checks, so every missing variable is listed
  if [[ $missing_vars -eq 1 ]]; then
    error "Required environment variables are missing. Please set them and try again."
  fi

  success "Environment variable check completed"
}
Collaborator

@Prajyot-Parab Prajyot-Parab Oct 27, 2025

Same comment as above.

4 participants