This repository has been archived by the owner on Mar 31, 2022. It is now read-only.

Reaper status shows RUNNING, though the repair is successfully completed #131

Open
jaibheem opened this issue Dec 8, 2015 · 4 comments

Comments

@jaibheem

jaibheem commented Dec 8, 2015

Reaper status shows RUNNING even though the repair has completed successfully.

Cassandra Version: 2.0.15

A couple of questions:

  1. I tried running repair multiple times for different keyspaces. The repair state still shows "RUNNING" after more than 48 hours, even though the repair has completed.
  2. The repair always runs only on the node "10.16.3.162", which is a seed node. How can we make sure the repair continues on the next nodes once this one is completed?
    FYI:
    The cluster was added via the seed node: ./bin/spreaper add-cluster 10.16.3.162

[root@test cassandra-reaper]# ./bin/spreaper list-runs

Report improvements/bugs at https://github.com/spotify/cassandra-reaper/issues

------------------------------------------------------------------------------

Listing repair runs

Found 1 repair runs

[
  {
    "cause": "manual spreaper run",
    "cluster_name": "test",
    "column_families": [],
    "creation_time": "2015-12-08T06:19:39Z",
    "duration": null,
    "end_time": null,
    "estimated_time_of_arrival": "2016-04-07T11:54:22Z",
    "id": 1,
    "intensity": 0.900,
    "keyspace_name": "cache",
    "last_event": "Triggered repair of segment 1 via host 10.16.3.162",
    "owner": "root",
    "pause_time": null,
    "repair_parallelism": "DATACENTER_AWARE",
    "segments_repaired": 1,
    "start_time": "2015-12-08T06:19:40Z",
    "state": "RUNNING",
    "total_segments": 301
  }
]

Any help on this is appreciated.
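
A listing like the one above can also be checked programmatically. The following is a minimal Python sketch, assuming the captured output has the same shape as shown in this comment (banner lines followed by a JSON array). `parse_list_runs` and `progress` are hypothetical helper names, not part of spreaper, and the sample text is abbreviated from the output above.

```python
import json


def parse_list_runs(output):
    """Return the list of repair runs found in captured list-runs output.

    Assumes the JSON array starts on the first line beginning with "[",
    as in the output quoted in this thread.
    """
    lines = output.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("["):
            text = "\n".join(lines[i:]).lstrip()
            # raw_decode tolerates any trailing text after the array.
            runs, _ = json.JSONDecoder().raw_decode(text)
            return runs
    return []


def progress(run):
    """Fraction of segments already repaired for one run."""
    return run["segments_repaired"] / run["total_segments"]


# Abbreviated copy of the output above (assumption: same field names).
SAMPLE = """Listing repair runs
Found 1 repair runs
[
{
"id": 1,
"keyspace_name": "cache",
"segments_repaired": 1,
"state": "RUNNING",
"total_segments": 301
}
]
"""

for run in parse_list_runs(SAMPLE):
    if run["state"] == "RUNNING":
        print(
            f'run {run["id"]} ({run["keyspace_name"]}): '
            f"{progress(run):.1%} of segments repaired"
        )
```

A run that stays in RUNNING with this fraction unchanged across several checks is a reasonable candidate for closer inspection.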

@jaibheem
Author

Any help on this please?

@jaibheem
Author

jaibheem commented Nov 1, 2016

Any help on this?

@deniszh

deniszh commented Nov 1, 2016

According to the job status, your repair is still running. The repair of segment 1 on host 10.16.3.162 started at "2015-12-08T06:19:40Z", and it is running very slowly (or is stuck). That is only the first segment, and you have another 300 to go, so Reaper predicts a finish time of "2016-04-07T11:54:22Z", i.e. roughly four months after the start. I suspect the ETA is not accurate, though, because the run is stuck and cannot finish repairing even a single node.
Try stopping the repair and starting it again. Check the Cassandra logs on 10.16.3.162. Enable DEBUG logging and check the Reaper logs to see what it is doing.
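
The timestamps in the job status can be turned into a rough per-segment figure. This is only a sketch of the arithmetic, assuming the ETA is a simple linear extrapolation over all segments; the thread does not confirm how Reaper actually computes it. The variable names are illustrative.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"


def parse(ts):
    """Parse a UTC timestamp in the format used by the run listing."""
    return datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)


# Values copied from the run listing in this issue.
start = parse("2015-12-08T06:19:40Z")
eta = parse("2016-04-07T11:54:22Z")
total_segments = 301

# Projected length of the whole run, from start to predicted finish.
projected = eta - start

# Assumption: every segment takes the same time (linear extrapolation).
per_segment = projected / total_segments
hours_per_segment = per_segment.total_seconds() / 3600

print(f"projected run length: {projected.days} days")
print(f"implied time per segment: {hours_per_segment:.1f} hours")
```

Under that assumption the projection works out to many hours per segment, which is consistent with a run that is stalled rather than progressing normally.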

@jaibheem
Author

jaibheem commented Nov 1, 2016

@deniszh Thanks for the response.

I tried it several times. Reaper invokes the repair and the repair completes, but the Reaper status doesn't change to completed.


2 participants