Replication resync #6261

BiniKP · 2025-02-05T06:43:45Z

Version
{
"component": "core",
"version": "3.69.2",
"package": "pulpcore",
"module": "pulpcore.app",
"domain_compatible": true
}
{
"component": "ansible",
"version": "0.23.1",
"package": "pulp-ansible",
"module": "pulp_ansible.app",
"domain_compatible": false
}
{
"component": "container",
"version": "2.22.1",
"package": "pulp-container",
"module": "pulp_container.app",
"domain_compatible": false
}
{
"component": "deb",
"version": "3.5.0",
"package": "pulp_deb",
"module": "pulp_deb.app",
"domain_compatible": false
}
{
"component": "certguard",
"version": "3.69.2",
"package": "pulpcore",
"module": "pulp_certguard.app",
"domain_compatible": true
}
K8S installation with pulp-operator.

Describe the bug
During a replication, the last task got stuck, and I had to cancel it after several hours. But now, it is not trying to replicate that repository again, even though I deleted the repository tree created for it.

To Reproduce
Create a new Pulp deployment, create an upstream-pulp (via the pulp client or the API), and run the replication (via the pulp client or the API). If anything fails, you will not be able to force a new replication to download the content again.

Expected behavior
To have an option to force the download again from the same source.

Additional context
Even running pulp rpm sync with the repository and the remote that the replication created didn't work. It seems to download something, but the result is not published at the end like the rest of the content.

BiniKP · 2025-02-05T16:40:49Z

Update:
If you delete the upstream-pulp through the API, remove all repositories, remotes, and distributions created by the replica, and run "pulp orphan cleanup" and "pulp repository reclaim --all" to clean up any remaining garbage, you will be able to create a new upstream-pulp and download everything successfully. (Additionally, I ran "pulp rpm prune-packages --all-repositories", but I am not sure if it had any effect on the solution.)

The problem, as you can imagine, is that I had to delete everything on the Pulp instance.

Also, there were no repositories from other plugins, but they will probably be affected if you apply this workaround in your environment.

mdellweg · 2025-02-06T10:00:06Z

Is there any chance you have some logging output of the failed sync and the failed reattempts? Is it even reproducible?

BiniKP · 2025-02-06T11:33:25Z

Sorry, the logs of the Kubernetes worker that ran the task have already been rotated. The only thing I have is the record of the task in Pulp:

 {
    "pulp_href": "/pulp/api/v3/task-groups/0194d0b3-3206-71af-9eae-1934b450d796/",
    "prn": "prn:core.taskgroup:0194d0b3-3206-71af-9eae-1934b450d796",
    "description": "Replication of pulp",
    "all_tasks_dispatched": true,
    "waiting": 0,
    "skipped": 0,
    "running": 0,
    "completed": 14,
    "canceled": 1,
    "failed": 0,
    "canceling": 0,
    "group_progress_reports": [],
    "tasks": [
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6170-71cf-96ee-3635e79475ff/",
        "prn": "prn:core.task:0194d0b3-6170-71cf-96ee-3635e79475ff",
        "pulp_created": "2025-02-04T11:23:24.401105Z",
        "pulp_last_updated": "2025-02-04T11:23:24.401121Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.655800Z",
        "started_at": "2025-02-04T11:23:22.743902Z",
        "finished_at": "2025-02-04T11:25:49.629534Z",
        "worker": "/pulp/api/v3/workers/0194d09a-1ade-7799-ad21-661303e355d4/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-611f-7e52-acc9-7ecdf97f18c2/",
        "prn": "prn:core.task:0194d0b3-611f-7e52-acc9-7ecdf97f18c2",
        "pulp_created": "2025-02-04T11:23:24.320744Z",
        "pulp_last_updated": "2025-02-04T11:23:24.320759Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.654156Z",
        "started_at": "2025-02-04T11:23:25.411986Z",
        "finished_at": "2025-02-04T11:25:54.884738Z",
        "worker": "/pulp/api/v3/workers/0194d099-da58-717f-8f3f-c51c3cca96da/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-5e7f-7386-9794-c3c9f3e19a3a/",
        "prn": "prn:core.task:0194d0b3-5e7f-7386-9794-c3c9f3e19a3a",
        "pulp_created": "2025-02-04T11:23:23.648596Z",
        "pulp_last_updated": "2025-02-04T11:23:23.648611Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.652379Z",
        "started_at": "2025-02-04T11:23:23.882758Z",
        "finished_at": "2025-02-04T11:47:49.163193Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-61d8-73a9-9a26-997ef7405351/",
        "prn": "prn:core.task:0194d0b3-61d8-73a9-9a26-997ef7405351",
        "pulp_created": "2025-02-04T11:23:24.505119Z",
        "pulp_last_updated": "2025-02-04T11:23:24.505135Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T12:08:59.429739Z",
        "started_at": "2025-02-04T12:08:58.346659Z",
        "finished_at": "2025-02-04T12:08:58.761941Z",
        "worker": "/pulp/api/v3/workers/0194d09a-1ade-7799-ad21-661303e355d4/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6235-7d4c-8f39-bf65ff5ae50e/",
        "prn": "prn:core.task:0194d0b3-6235-7d4c-8f39-bf65ff5ae50e",
        "pulp_created": "2025-02-04T11:23:24.598421Z",
        "pulp_last_updated": "2025-02-04T11:23:24.598437Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T14:58:46.913330Z",
        "started_at": "2025-02-04T14:58:45.782934Z",
        "finished_at": "2025-02-04T14:58:46.180092Z",
        "worker": "/pulp/api/v3/workers/0194d09a-1ade-7799-ad21-661303e355d4/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-621b-70cf-8039-2641d223b52f/",
        "prn": "prn:core.task:0194d0b3-621b-70cf-8039-2641d223b52f",
        "pulp_created": "2025-02-04T11:23:24.572529Z",
        "pulp_last_updated": "2025-02-04T11:23:24.572552Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "canceled",
        "unblocked_at": "2025-02-04T11:23:22.659306Z",
        "started_at": "2025-02-04T11:25:55.090467Z",
        "finished_at": "2025-02-04T14:58:52.418323Z",
        "worker": "/pulp/api/v3/workers/0194d099-da58-717f-8f3f-c51c3cca96da/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-60e4-7436-8d7b-d53d70e9e6bf/",
        "prn": "prn:core.task:0194d0b3-60e4-7436-8d7b-d53d70e9e6bf",
        "pulp_created": "2025-02-04T11:23:24.261392Z",
        "pulp_last_updated": "2025-02-04T11:23:24.261414Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:47:50.781742Z",
        "started_at": "2025-02-04T11:54:15.199788Z",
        "finished_at": "2025-02-04T11:54:15.712694Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-320c-7c93-9998-9e69fe0df80a/",
        "prn": "prn:core.task:0194d0b3-320c-7c93-9998-9e69fe0df80a",
        "pulp_created": "2025-02-04T11:23:12.269459Z",
        "pulp_last_updated": "2025-02-04T11:23:12.269473Z",
        "name": "pulpcore.app.tasks.replica.replicate_distributions",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:12.292854Z",
        "started_at": "2025-02-04T11:23:13.898680Z",
        "finished_at": "2025-02-04T11:23:25.291514Z",
        "worker": "/pulp/api/v3/workers/0194d099-da58-717f-8f3f-c51c3cca96da/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6136-784a-9f47-9d98140d57a5/",
        "prn": "prn:core.task:0194d0b3-6136-784a-9f47-9d98140d57a5",
        "pulp_created": "2025-02-04T11:23:24.343720Z",
        "pulp_last_updated": "2025-02-04T11:23:24.343735Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:54:17.258236Z",
        "started_at": "2025-02-04T12:02:19.178708Z",
        "finished_at": "2025-02-04T12:02:19.585775Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6288-7afc-8d65-3f426d53f1af/",
        "prn": "prn:core.task:0194d0b3-6288-7afc-8d65-3f426d53f1af",
        "pulp_created": "2025-02-04T11:23:24.681764Z",
        "pulp_last_updated": "2025-02-04T11:23:24.681778Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T14:58:47.411701Z",
        "started_at": "2025-02-04T14:58:47.497664Z",
        "finished_at": "2025-02-04T14:58:47.889396Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-62d4-7025-8cc8-1b81721235df/",
        "prn": "prn:core.task:0194d0b3-62d4-7025-8cc8-1b81721235df",
        "pulp_created": "2025-02-04T11:23:24.757572Z",
        "pulp_last_updated": "2025-02-04T11:23:24.757587Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T14:58:46.702105Z",
        "started_at": "2025-02-04T14:58:47.986408Z",
        "finished_at": "2025-02-04T14:58:48.371997Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6273-7c5b-a931-e94d44979026/",
        "prn": "prn:core.task:0194d0b3-6273-7c5b-a931-e94d44979026",
        "pulp_created": "2025-02-04T11:23:24.660079Z",
        "pulp_last_updated": "2025-02-04T11:23:24.660094Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.660925Z",
        "started_at": "2025-02-04T11:47:49.341087Z",
        "finished_at": "2025-02-04T11:54:15.013271Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-62be-7958-a6ca-0729fca0ec64/",
        "prn": "prn:core.task:0194d0b3-62be-7958-a6ca-0729fca0ec64",
        "pulp_created": "2025-02-04T11:23:24.735466Z",
        "pulp_last_updated": "2025-02-04T11:23:24.735481Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.662582Z",
        "started_at": "2025-02-04T11:54:15.834233Z",
        "finished_at": "2025-02-04T12:02:19.040620Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-61c0-79e5-a533-ac635332c900/",
        "prn": "prn:core.task:0194d0b3-61c0-79e5-a533-ac635332c900",
        "pulp_created": "2025-02-04T11:23:24.481740Z",
        "pulp_last_updated": "2025-02-04T11:23:24.481755Z",
        "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
        "state": "completed",
        "unblocked_at": "2025-02-04T11:23:22.657394Z",
        "started_at": "2025-02-04T11:25:49.766340Z",
        "finished_at": "2025-02-04T12:08:58.141099Z",
        "worker": "/pulp/api/v3/workers/0194d09a-1ade-7799-ad21-661303e355d4/"
      },
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d0b3-6187-78de-9ede-3a64af197145/",
        "prn": "prn:core.task:0194d0b3-6187-78de-9ede-3a64af197145",
        "pulp_created": "2025-02-04T11:23:24.424153Z",
        "pulp_last_updated": "2025-02-04T11:23:24.424168Z",
        "name": "pulpcore.app.tasks.base.general_create",
        "state": "completed",
        "unblocked_at": "2025-02-04T12:02:18.443609Z",
        "started_at": "2025-02-04T12:02:19.688538Z",
        "finished_at": "2025-02-04T12:02:20.097750Z",
        "worker": "/pulp/api/v3/workers/0194d09a-6fc3-76aa-9dfd-2ec6dc2b831d/"
      }
    ]
  },

The canceled task is this one:

{
    "pulp_href": "/pulp/api/v3/tasks/0194d0b3-621b-70cf-8039-2641d223b52f/",
    "prn": "prn:core.task:0194d0b3-621b-70cf-8039-2641d223b52f",
    "pulp_created": "2025-02-04T11:23:24.572529Z",
    "pulp_last_updated": "2025-02-04T11:23:24.572552Z",
    "state": "canceled",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid": "2ac4fd85150a40eaa930c5bcc367067f",
    "created_by": "/pulp/api/v3/users/1/",
    "unblocked_at": "2025-02-04T11:23:22.659306Z",
    "started_at": "2025-02-04T11:25:55.090467Z",
    "finished_at": "2025-02-04T14:58:52.418323Z",
    "error": null,
    "worker": "/pulp/api/v3/workers/0194d099-da58-717f-8f3f-c51c3cca96da/",
    "parent_task": "/pulp/api/v3/tasks/0194d0b3-320c-7c93-9998-9e69fe0df80a/",
    "child_tasks": [],
    "task_group": "/pulp/api/v3/task-groups/0194d0b3-3206-71af-9eae-1934b450d796/",
    "progress_reports": [
      {
        "message": "Downloading Metadata Files",
        "code": "sync.downloading.metadata",
        "state": "completed",
        "total": null,
        "done": 6,
        "suffix": null
      },
      {
        "message": "Skipping Packages",
        "code": "sync.skipped.packages",
        "state": "completed",
        "total": 0,
        "done": 0,
        "suffix": null
      },
      {
        "message": "Parsed Packages",
        "code": "sync.parsing.packages",
        "state": "completed",
        "total": 13787,
        "done": 13787,
        "suffix": null
      },
      {
        "message": "Parsed Comps",
        "code": "sync.parsing.comps",
        "state": "completed",
        "total": 41,
        "done": 41,
        "suffix": null
      },
      {
        "message": "Parsed Advisories",
        "code": "sync.parsing.advisories",
        "state": "completed",
        "total": 4808,
        "done": 4808,
        "suffix": null
      },
      {
        "message": "Associating Content",
        "code": "associating.content",
        "state": "running",
        "total": null,
        "done": 17502,
        "suffix": null
      },
      {
        "message": "Downloading Artifacts",
        "code": "sync.downloading.artifacts",
        "state": "running",
        "total": null,
        "done": 13786,
        "suffix": null
      }
    ],
    "created_resources": [],
    "reserved_resources_record": [
      "prn:rpm.rpmrepository:0194d0b3-61ff-7a9d-93ef-1d1d425b61c4",
      "shared:prn:rpm.rpmremote:0194d0b3-61f0-7b9d-a517-95cbe3ddb5c2",
      "shared:prn:core.domain:0194d06f-3c81-7b10-b04f-288c85446af5"
    ]
  }
]

As you can see, the task was running and was stuck in this state for nearly 2 hours.
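To put numbers on the record above, here is a minimal Python sketch that computes how long the canceled task ran and which progress reports never left the "running" state. The dictionary is a trimmed copy of the JSON pasted above; the helper names (`parse_ts`, `task_duration`, `unfinished_reports`) are made up for illustration and are not part of Pulp.

```python
from datetime import datetime

# Trimmed copy of the canceled task record pasted above.
task = {
    "state": "canceled",
    "started_at": "2025-02-04T11:25:55.090467Z",
    "finished_at": "2025-02-04T14:58:52.418323Z",
    "progress_reports": [
        {"code": "sync.downloading.metadata", "state": "completed", "done": 6},
        {"code": "sync.parsing.packages", "state": "completed", "done": 13787},
        {"code": "associating.content", "state": "running", "done": 17502},
        {"code": "sync.downloading.artifacts", "state": "running", "done": 13786},
    ],
}

# Timestamp layout used in the task records shown in this thread.
PULP_TS = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_ts(value):
    """Parse a timestamp such as 2025-02-04T11:25:55.090467Z."""
    return datetime.strptime(value, PULP_TS)


def task_duration(task):
    """Wall-clock time between started_at and finished_at."""
    return parse_ts(task["finished_at"]) - parse_ts(task["started_at"])


def unfinished_reports(task):
    """Codes of the progress reports that did not reach 'completed'."""
    return [r["code"] for r in task["progress_reports"] if r["state"] != "completed"]


duration = task_duration(task)
stuck = unfinished_reports(task)
print(duration, stuck)
```

The two reports still marked as running are the artifact download and the content association stages, which matches the description of a sync that stopped making progress before it was canceled.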
The following replica request was this one:

 {
    "pulp_href": "/pulp/api/v3/task-groups/0194d186-5879-7a79-8990-8fb3953007e0/",
    "prn": "prn:core.taskgroup:0194d186-5879-7a79-8990-8fb3953007e0",
    "description": "Replication of pulp",
    "all_tasks_dispatched": true,
    "waiting": 0,
    "skipped": 0,
    "running": 0,
    "completed": 1,
    "canceled": 0,
    "failed": 0,
    "canceling": 0,
    "group_progress_reports": [],
    "tasks": [
      {
        "pulp_href": "/pulp/api/v3/tasks/0194d186-587f-7d54-8947-b74f642288fd/",
        "prn": "prn:core.task:0194d186-587f-7d54-8947-b74f642288fd",
        "pulp_created": "2025-02-04T15:13:50.208566Z",
        "pulp_last_updated": "2025-02-04T15:13:50.208581Z",
        "name": "pulpcore.app.tasks.replica.replicate_distributions",
        "state": "completed",
        "unblocked_at": "2025-02-04T15:13:50.232040Z",
        "started_at": "2025-02-04T15:13:49.097432Z",
        "finished_at": "2025-02-04T15:13:52.786248Z",
        "worker": "/pulp/api/v3/workers/0194d09a-1ade-7799-ad21-661303e355d4/"
      }
    ]
  },

There is no other task in the task group and there are no errors in the logs, as far as I can see.
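The difference between the two task groups can be made explicit with a short Python sketch. The task names and states are taken from the two records above (7 sync tasks of which one was canceled, 7 general_create tasks, and one replicate_distributions task in the first group; a single replicate_distributions task in the second). The helpers `summarize` and `dispatched_sync` are hypothetical names used only for this illustration.

```python
from collections import Counter

SYNC = "pulp_rpm.app.tasks.synchronizing.synchronize"
CREATE = "pulpcore.app.tasks.base.general_create"
REPLICATE = "pulpcore.app.tasks.replica.replicate_distributions"

# Name/state pairs condensed from the first task group above.
first_attempt = (
    [{"name": SYNC, "state": "completed"}] * 6
    + [{"name": SYNC, "state": "canceled"}]
    + [{"name": CREATE, "state": "completed"}] * 7
    + [{"name": REPLICATE, "state": "completed"}]
)

# The second task group contains only the parent replication task.
second_attempt = [{"name": REPLICATE, "state": "completed"}]


def summarize(tasks):
    """Count tasks per (short task name, state)."""
    return Counter((t["name"].rsplit(".", 1)[-1], t["state"]) for t in tasks)


def dispatched_sync(tasks):
    """True if the group contains at least one RPM sync task."""
    return any(t["name"] == SYNC for t in tasks)


for label, tasks in (("first", first_attempt), ("second", second_attempt)):
    print(label, dict(summarize(tasks)), "sync dispatched:", dispatched_sync(tasks))
```

Run against the real API output, the same check would work on the "tasks" list of each task group unchanged, since it only reads the "name" and "state" keys.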

I was able to reproduce the issue twice: the first time after creating the k8s cluster, and the second time after destroying and redeploying it. The third time, when it was able to replicate, was after manually deleting everything in the Pulp cluster as described in the main post.

Sorry for the long post and low details.

mdellweg · 2025-02-06T11:53:17Z

No worries. I fear however I cannot deduce more information from it.
Random thoughts: A sync task can take several hours depending on the amount of stuff needed to sync. It seems like the second attempt did not even start a sync task. Maybe there's a hole in the "no changes detected, skip update" logic.

mdellweg · 2025-02-06T18:27:59Z

OK, looking into this, I can confirm that the replica optimization logic fails here.
Either updating the upstream distribution, or recreating the UpstreamPulp object (in fact, clearing the last_replication field should suffice) should allow for resyncing.
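For readers following along, here is a toy Python model of how a "skip if nothing changed" optimization can get stuck after a canceled sync. It is not the pulpcore implementation: only the `last_replication` field name comes from this thread, while `UpstreamState`, `should_sync`, `replicate`, and the `record_only_on_success` switch are assumptions made for illustration. The guarded variant is one possible way to avoid the problem, not necessarily what the linked fix does.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class UpstreamState:
    # Hypothetical stand-in for the UpstreamPulp record.
    last_replication: Optional[datetime] = None


def should_sync(upstream, distribution_updated):
    """Skip when the upstream distribution did not change since the recorded replication."""
    if upstream.last_replication is None:
        return True
    return distribution_updated > upstream.last_replication


def replicate(upstream, distribution_updated, now, sync_succeeds, record_only_on_success):
    """Toy replication run returning 'skipped', 'completed' or 'canceled'."""
    if not should_sync(upstream, distribution_updated):
        return "skipped"
    # Assumed flaw: the timestamp is recorded even if the sync does not finish.
    if sync_succeeds or not record_only_on_success:
        upstream.last_replication = now
    return "completed" if sync_succeeds else "canceled"


changed = datetime(2025, 2, 4, 11, 0)   # upstream distribution last changed
first = datetime(2025, 2, 4, 11, 23)    # first replication attempt
second = datetime(2025, 2, 4, 15, 13)   # second replication attempt

# Naive variant: the canceled run still records a timestamp, so the retry is skipped.
naive = UpstreamState()
print(replicate(naive, changed, first, False, False))
print(replicate(naive, changed, second, True, False))

# Clearing last_replication, as suggested above, lets the next run sync again.
naive.last_replication = None
print(replicate(naive, changed, second, True, False))

# Guarded variant: the timestamp is recorded only after a successful sync.
guarded = UpstreamState()
print(replicate(guarded, changed, first, False, True))
print(replicate(guarded, changed, second, True, True))
```

In this model, updating the upstream distribution also unblocks the retry, because `distribution_updated` then becomes newer than the recorded timestamp.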


BiniKP · 2025-02-07T06:34:02Z

> OK, looking into this, I can confirm that the replica optimization logic fails here. Either updating the upstream distribution, or recreating the UpstreamPulp object (in fact, clearing the last_replication field should suffice) should allow for resyncing.

While investigating the functionality of UpstreamPulp, I noticed that there is no "destroy" option for it in the Pulp client, but it does exist in the API. I unfortunately discovered the hard way that using it will orphan everything replicated on the server, and you will likely have to destroy the cluster if you want to continue. I'm not sure if this is a known issue or if it could affect your modifications.


mdellweg · 2025-02-07T09:37:18Z

Yes, making it an exact clone (deleting everything else in the domain) is part of the design. I just realized that I cannot find any documentation for it either.
Also there is work going on improving the deletion options: #6247

BiniKP added Issue Triage-Needed labels Feb 5, 2025

mdellweg added a commit to mdellweg/pulpcore that referenced this issue Feb 6, 2025

Fix replication in case of previous failure

cdd9332

fixes pulp#6261

mdellweg linked a pull request Feb 6, 2025 that will close this issue

Fix replication in case of previous failure #6266

Open

mdellweg added a commit to mdellweg/pulpcore that referenced this issue Feb 6, 2025

Fix replication in case of previous failure

244424c

fixes pulp#6261

mdellweg added a commit to mdellweg/pulpcore that referenced this issue Feb 6, 2025

Fix replication in case of previous failure

92d849e

fixes pulp#6261

mdellweg added a commit to mdellweg/pulpcore that referenced this issue Feb 7, 2025

Fix replication in case of previous failure

455d298

fixes pulp#6261
