Commit

Remove references to BFG Repo Cleaner (#53425)
newren authored Dec 9, 2024
1 parent 8af11f7 commit 4070c99
Showing 6 changed files with 12 additions and 58 deletions.
Original file line number Diff line number Diff line change
@@ -20,9 +20,9 @@ shortTitle: Remove sensitive data

## About removing sensitive data from a repository

When altering your repository's history using tools like `git filter-repo` or the BFG Repo-Cleaner, it's crucial to understand the implications, especially regarding open pull requests and sensitive data.
When altering your repository's history using tools like `git filter-repo`, it's crucial to understand the implications, especially regarding open pull requests and sensitive data.

The `git filter-repo` tool and the BFG Repo-Cleaner rewrite your repository's history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.
The `git filter-repo` tool rewrites your repository's history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.
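The effect on dependent commits can be seen in a toy repository. The following is a minimal sketch, not the procedure this article recommends: it uses plain `git rebase --onto` rather than `git filter-repo` so that it runs with stock Git, and the file names, commit messages, and identity values are hypothetical. It drops one commit from a three-commit history and shows that the later commit, whose content did not change, still receives a new SHA because its parent changed.

```shell
set -eu

# Work in a throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

# Commit 1: harmless base commit.
echo "hello" > README.md
git add README.md
git commit -q -m "Add README"
base="$(git rev-parse HEAD)"

# Commit 2: the commit that introduces the sensitive file.
echo "hunter2" > secret.txt
git add secret.txt
git commit -q -m "Add secret"
secret="$(git rev-parse HEAD)"

# Commit 3: unrelated later work that depends on commit 2.
echo "more" >> README.md
git commit -q -am "Update README"
old_tip="$(git rev-parse HEAD)"

# Replay everything after the secret commit directly onto the base commit.
git rebase -q --onto "$base" "$secret"
new_tip="$(git rev-parse HEAD)"

echo "tip before rewrite: $old_tip"
echo "tip after rewrite:  $new_tip"
```

The two printed SHAs differ even though the "Update README" change itself is identical, which is why open pull requests based on the old commits are affected by a rewrite.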

You can remove the file from the latest commit with `git rm`. For information on removing a file that was added with the latest commit, see "[AUTOTITLE](/repositories/working-with-files/managing-large-files/about-large-files-on-github#removing-files-from-a-repositorys-history)."

@@ -48,37 +48,7 @@ If the commit that introduced the sensitive data exists in any forks, it will co

Consider these limitations and challenges in your decision to rewrite your repository's history.

## Purging a file from your repository's history

You can purge a file from your repository's history using either the `git filter-repo` tool or the BFG Repo-Cleaner open source tool.

> [!NOTE] If sensitive data is located in a file that's identified as a binary file, you'll need to remove the file from the history, as you can't modify it to remove or replace the data.
### Using the BFG

The [BFG Repo-Cleaner](https://rtyley.github.io/bfg-repo-cleaner/) is a tool that's built and maintained by the open source community. It provides a faster, simpler alternative to `git filter-repo` for removing unwanted data.

For example, to remove your file with sensitive data and leave your latest commit untouched, run:

```shell
bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA
```

To replace all text listed in `passwords.txt` wherever it can be found in your repository's history, run:

```shell
bfg --replace-text passwords.txt
```

After the sensitive data is removed, you must force push your changes to {% data variables.product.product_name %}. Force pushing rewrites the repository history, which removes sensitive data from the commit history. If you force push, it may overwrite commits that other people have based their work on.

```shell
git push --force
```

See the [BFG Repo-Cleaner](https://rtyley.github.io/bfg-repo-cleaner/)'s documentation for full usage and download instructions.

### Using git filter-repo
## Purging a file from your repository's history using git-filter-repo

> [!WARNING] If you run `git filter-repo` after stashing changes, you won't be able to retrieve your changes with other stash commands. Before running `git filter-repo`, we recommend unstashing any changes you've made. To unstash the last set of changes you've stashed, run `git stash show -p | git apply -R`. For more information, see [Git Tools - Stashing and Cleaning](https://git-scm.com/book/en/v2/Git-Tools-Stashing-and-Cleaning).
@@ -178,7 +148,7 @@ To illustrate how `git filter-repo` works, we'll show you how to remove your fil

## Fully removing the data from {% data variables.product.prodname_dotcom %}

After using either the BFG tool or `git filter-repo` to remove the sensitive data and pushing your changes to {% data variables.product.product_name %}, you must take a few more steps to fully remove the data from {% data variables.product.product_name %}.
After using `git filter-repo` to remove the sensitive data and pushing your changes to {% data variables.product.product_name %}, you must take a few more steps to fully remove the data from {% data variables.product.product_name %}.

{% ifversion ghec %}
1. If the repository was migrated using the {% data variables.product.prodname_importer_proper_name %}, there may be some non-standard Git references that follow the pattern `refs/github-services`, that neither the BFG tool or `git filter-repo` can remove. In this case, remove those references running the following commands in your local copy of the repository:
@@ -205,22 +175,6 @@ After using either the BFG tool or `git filter-repo` to remove the sensitive dat

1. Tell your collaborators to [rebase](https://git-scm.com/book/en/v2/Git-Branching-Rebasing), _not_ merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging.

1. If you used `git filter-repo`, you can skip this step.

If you used the BFG tool, after rewriting, you can clean up references in your local repository to the old history to be dereferenced and garbage collected with the following commands (using Git 1.8.5 or newer):

```shell
$ git reflog expire --expire=now --all
$ git gc --prune=now
> Counting objects: 2437, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (1378/1378), done.
> Writing objects: 100% (2437/2437), done.
> Total 2437 (delta 1461), reused 1802 (delta 1048)
```

> [!NOTE] You can also achieve this by pushing your filtered history to a new or empty repository and then making a fresh clone from {% data variables.product.product_name %}.

{% ifversion ghes %}

## Identifying reachable commits
@@ -245,7 +199,7 @@ If references are found in any forks, the results will look similar, but will st
ghe-nwo NWO
```

The same procedure using the BFG tool or `git filter-repo` can be used to remove the sensitive data from the repository's forks. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
The same procedure using `git filter-repo` can be used to remove the sensitive data from the repository's forks. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
Once you have removed the commit's references, re-run the commands to double-check.

Original file line number Diff line number Diff line change
@@ -105,7 +105,7 @@ To ensure that all code is properly reviewed prior to being merged into the defa

## Mitigate data leaks

If a user pushes sensitive data, ask them to remove it by using the `git filter-repo` tool or the BFG Repo-Cleaner open source tool. For more information, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)." Also, it is possible to revert almost anything in Git. For more information, see [{% data variables.product.prodname_blog %}](https://github.blog/2015-06-08-how-to-undo-almost-anything-with-git/).
If a user pushes sensitive data, ask them to remove it by using the `git filter-repo` tool. For more information, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)." Also, it is possible to revert almost anything in Git. For more information, see [{% data variables.product.prodname_blog %}](https://github.blog/2015-06-08-how-to-undo-almost-anything-with-git/).

At the organization level, if you're unable to coordinate with the user who pushed the sensitive data to remove it, we recommend you contact {% data variables.contact.contact_support %} with the concerning commit SHA.

Original file line number Diff line number Diff line change
@@ -56,7 +56,7 @@ Below is a typical workflow that explains how {% data variables.product.prodname

* **Remediation:** You then need to take appropriate actions to remediate the exposure. This might include:
* Rotating the affected credential to ensure it is no longer usable.
* Removing the secret from the repository's history (using tools like BFG Repo-Cleaner or {% data variables.product.prodname_dotcom %}'s built-in features).
* Removing the secret from the repository's history (using tools like `git-filter-repo` or {% data variables.product.prodname_dotcom %}'s built-in features).

* **Monitoring:** It's good practice to regularly audit and monitor your repositories to ensure no other secrets are exposed.

Original file line number Diff line number Diff line change
@@ -98,7 +98,7 @@ If the file was added with your most recent commit, and you have not pushed to {

### Removing a file that was added in an earlier commit

If you added a file in an earlier commit, you need to remove it from the repository's history. To remove files from the repository's history, you can use the BFG Repo-Cleaner or the `git filter-repo` command. For more information see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
If you added a file in an earlier commit, you need to remove it from the repository's history. To remove files from the repository's history, we recommend the `git filter-repo` command. For more information see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."

## Distributing large binaries

Original file line number Diff line number Diff line change
@@ -16,9 +16,9 @@ After installing {% data variables.large_files.product_name_short %} and configu
{% data reusables.large_files.resolving-upload-failures %}

> [!TIP]
> If you get an error that "this exceeds {% data variables.large_files.product_name_short %}'s file size limit of {% data variables.large_files.max_github_size %}" when you try to push files to Git, you can use `git lfs migrate` instead of `filter-repo` or the BFG Repo Cleaner, to move the large file to {% data variables.large_files.product_name_long %}. For more information about the `git lfs migrate` command, see the [Git LFS 2.2.0](https://github.com/blog/2384-git-lfs-2-2-0-released) release announcement.
> If you get an error that "this exceeds {% data variables.large_files.product_name_short %}'s file size limit of {% data variables.large_files.max_github_size %}" when you try to push files to Git, you can use `git lfs migrate` instead of `filter-repo`, to move the large file to {% data variables.large_files.product_name_long %}. For more information about the `git lfs migrate` command, see the [Git LFS 2.2.0](https://github.com/blog/2384-git-lfs-2-2-0-released) release announcement.
1. Remove the file from the repository's Git history using either the `filter-repo` command or BFG Repo-Cleaner. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Remove the file from the repository's Git history using the `filter-repo` command. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Configure tracking for your file and push it to {% data variables.large_files.product_name_short %}. For more information on this procedure, see "[AUTOTITLE](/repositories/working-with-files/managing-large-files/configuring-git-large-file-storage)."

## Further reading
Original file line number Diff line number Diff line change
@@ -13,7 +13,7 @@ shortTitle: Remove files
---
## Removing a single file

1. Remove the file from the repository's Git history using either the `filter-repo` command or BFG Repo-Cleaner. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Remove the file from the repository's Git history using the `filter-repo` command. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Navigate to your _.gitattributes_ file.

> [!NOTE]
@@ -24,7 +24,7 @@ shortTitle: Remove files

## Removing all files within a {% data variables.large_files.product_name_short %} repository

1. Remove the files from the repository's Git history using either the `filter-repo` command or BFG Repo-Cleaner. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Remove the files from the repository's Git history using the `filter-repo` command. For detailed information on using these, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)."
1. Optionally, to uninstall {% data variables.large_files.product_name_short %} in the repository, run:

```shell
