fix recent radosgw CI/CD pipeline failures #122

Merged: 1 commit into main from fix-radosgw-ci-cd, Nov 7, 2023

jtriley commented Nov 6, 2023

This fixes a failure in the radosgw portion of the CI/CD pipeline caused by an upstream change in ceph-exporter that affects the metrics setup on the cluster. Since this pipeline does not use the metrics setup, the patch simply disables setting up the monitoring stack altogether.

This patch also doubles the size of the OSDs, because the most recent version of ceph deployed by cephadm 17.2.6-0ubuntu0.22.04.1 requires more space. This was determined from out-of-space errors in the OSD logs on an external test VM.

Finally, this patch disables configuring firewalld, since it is not installed and not necessary for the CI/CD pipeline.
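
For reference, a minimal sketch of what skipping the monitoring stack and firewalld can look like at bootstrap time. This assumes the cluster is created with `cephadm bootstrap`; it is not the script changed in this PR. The function name `run_bootstrap`, the `DRY_RUN` switch, and the monitor IP `192.0.2.10` are placeholders for illustration. `--skip-monitoring-stack` and `--skip-firewalld` are existing `cephadm bootstrap` options.

```shell
#!/bin/sh
# Hypothetical sketch: names and the IP address below are placeholders,
# not taken from the repository's CI scripts.
set -eu

run_bootstrap() {
    mon_ip="$1"
    # Build the argument list once so the dry run and the real run match.
    set -- bootstrap \
        --mon-ip "$mon_ip" \
        --skip-monitoring-stack \
        --skip-firewalld
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the command that would be executed.
        echo "cephadm $*"
    else
        cephadm "$@"
    fi
}

run_bootstrap "${MON_IP:-192.0.2.10}"
```

With the default `DRY_RUN=1` the script only prints the command, so it can be checked without touching a host. Setting `DRY_RUN=0` runs `cephadm` for real and needs root on a host with cephadm installed.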

jtriley commented Nov 7, 2023

Testing this locally in an Ubuntu 22.04 VM shows the rgw container failing with:

root@jtr-rgw:~/coldfront-plugin-cloud/ci# docker logs 7fec4bed150a -f
debug 2023-11-07T15:52:21.773+0000 7fb05b0e1740  0 deferred set uid:gid to 167:167 (ceph:ceph)
debug 2023-11-07T15:52:21.773+0000 7fb05b0e1740  0 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable), process radosgw, pid 7
debug 2023-11-07T15:52:21.773+0000 7fb05b0e1740  0 framework: beast
debug 2023-11-07T15:52:21.773+0000 7fb05b0e1740  0 framework conf key: port, val: 80
debug 2023-11-07T15:52:21.773+0000 7fb05b0e1740  1 radosgw_Main not setting numa affinity
debug 2023-11-07T15:52:21.777+0000 7fb05b0e1740  1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
debug 2023-11-07T15:52:21.777+0000 7fb05b0e1740  1 D3N datacache enabled: 0
debug 2023-11-07T15:57:21.775+0000 7fb059c64700 -1 Initialization timeout, failed to initialize

Still looking into it.
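
Note that the last log line appears exactly five minutes after startup (15:52:21 to 15:57:21), which would be consistent with radosgw giving up after an initialization timeout of 300 seconds (as far as I know, the default for `rgw_init_timeout`). A small sketch for detecting this failure in a container's logs follows. The function name `rgw_init_timed_out` is hypothetical and the sample line is copied from the log above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the log text on stdin contains
# radosgw's initialization-timeout message.
rgw_init_timed_out() {
    grep -q 'Initialization timeout, failed to initialize'
}

# Demo using the final line of the log shown above.
sample='debug 2023-11-07T15:57:21.775+0000 7fb059c64700 -1 Initialization timeout, failed to initialize'
if printf '%s\n' "$sample" | rgw_init_timed_out; then
    echo "rgw initialization timed out; check cluster health before retrying"
fi

# Against a live container (docker logs may write to stderr, hence 2>&1):
#   docker logs 7fec4bed150a 2>&1 | rgw_init_timed_out && echo "rgw timed out"
```

A timeout like this says only that radosgw could not finish initializing against the cluster, so the next step is to check cluster state rather than the rgw container itself.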

jtriley commented Nov 7, 2023

Tracked this down to OSDs failing to join the cluster because they run out of space during initialization. Trying to double the size of the OSDs (from 500M to 1000M) to see if that helps.
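
To illustrate the size change, here is a sketch that assumes the OSDs are backed by image files attached as loop devices. That is an assumption; this conversation does not show how the pipeline creates them. The `OSD_SIZE` variable and the `make_osd_image` function are hypothetical names. It relies on GNU coreutils (`truncate`, `stat -c`), as found on Ubuntu.

```shell
#!/bin/sh
# Hypothetical sketch: assumes file-backed OSDs; names are illustrative only.
set -eu

# New size from this change; the previous value was 500M.
OSD_SIZE="${OSD_SIZE:-1000M}"

make_osd_image() {
    # $1: path of the backing file, $2: size in truncate(1) syntax.
    # truncate creates a sparse file, so no disk space is used up front.
    truncate -s "$2" "$1"
}

img="$(mktemp)"
make_osd_image "$img" "$OSD_SIZE"
echo "created $img with $(stat -c %s "$img") bytes"

# Attaching the file as a block device needs root, so it is left commented:
#   sudo losetup --find --show "$img"

rm -f "$img"
```

GNU `truncate` treats the `M` suffix as MiB, so `1000M` is 1000 x 1024 x 1024 bytes.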

jtriley changed the title from "skip dashboard and monitoring in radosgw CI/CD" to "fix recent radosgw CI/CD pipeline failures" on Nov 7, 2023
knikolla merged commit 16f0d9d into main on Nov 7, 2023
4 checks passed
knikolla deleted the fix-radosgw-ci-cd branch on November 7, 2023 17:02