Hidden GitHub Commits and How to Reveal Them


TL;DR

We built a tool for GitHub that reveals commits which are no longer reachable through the public Git history, for example because they were intentionally deleted, but which can still be accessed and may contain sensitive information.

Introduction

GitHub is one of the most commonly used platforms and cloud-based services for software development and version control using Git. People use it to share code, projects, or custom developments online; the project’s history is recorded in the commits (and the commit messages) and is available for viewing to anyone. GitHub offers many additional features, some of which are well-known and widely used while others remain virtually unknown to users.

Sometimes the shared data contains secrets or other information that should not have been published. There are ways to delete such data, but they are not intuitive at all. During a recent security evaluation, we examined a repository with deleted commits and identified a little-known GitHub flaw: it allows access to commits that are not listed in the project’s history, including commits that have been removed using a forced push. We have created github-secrets to find such commits in any public repository, so that users can determine whether supposedly deleted commits are still publicly visible in their repository.

We will now provide a brief overview of the process involved in deleting Git commits, emphasizing the challenges posed by a remote repository on GitHub with its caching and APIs, which make this task impossible without reaching out to GitHub customer support.

Git and Git reset

Git is a powerful and efficient tool for distributed version control. It tracks changes to all files stored in a repository and enables the collaborative workflows most programmers rely on. Its most common features are presumably well known, so we will not examine them more closely here.

xkcd git, CC BY 2.5

Suppose, however, that you have accidentally committed and pushed an incorrect entry to a repository. What can you do about it? You now have two options: you either create another commit and correct your error or — if you don’t want anybody to see the erroneous commit — you can follow the instructions you will find in a quick online search and execute the following two commands to undo the incorrect commit and clean the commit history:

git reset --hard HEAD^
git push origin -f

Let’s discuss this option for a moment. The reset command is a versatile tool with a range of options for undoing changes in a repository. A deep dive into the git reset command can be found in the Atlassian tutorials. In particular, git reset --hard HEAD^ moves the current branch back to the parent of the most recent commit and discards all file changes, making it appear that the commit never happened.

The subsequent push command with the flag -f (for force) overrides the check Git normally performs when pushing to the remote repository. This is necessary because Git would otherwise refuse to overwrite newer commits on the remote. As previously mentioned, Git is most commonly used when working on projects cooperatively. During development, it often happens that your local revision is older than the remote one (because someone else pushed a commit after your checkout). Normally, you would have to pull from the remote repository again and either merge or rebase onto the newer commit. In our case, however, we actively want to delete this “newer” commit. Using the -f flag, we tell Git that we know what we are doing, and the remote repository accepts our older state of the repository.

Our online research suggests that we can now correct the changes and push a new commit without leaving behind any trace of our previous error. But caution is advised: just because a commit is not accessible via the history does not mean that it cannot be retrieved. Commands like git reflog or git fsck --lost-found make these commits visible again and can even be used to restore them if needed. For example, this blog post describes how to restore a lost commit on the computer that was used to create and push it. A fresh clone of the repository from GitHub, however, does not reveal these commits via git reflog.

Local git reflog

Remote git reflog

So if we had accidentally pushed some secrets on GitHub, but force-pushed over the wrongful commit, we should be fine, right? Not at all!

Due to GitHub’s internal architecture, commits that have been overwritten by a force push are not deleted! While they are not fetched by a normal clone and cannot be enumerated with git reflog, they remain accessible to anyone who knows the commit hash, simply by specifying the hash in the GitHub URL. With that in mind, let us look at the GitHub API.

GitHub API and its additional functions

When a Git repository is set up on GitHub, a lot of data besides the repository itself is created. For example, it is possible to watch or star repositories on GitHub, and this information needs to be stored somewhere; GitHub Insights needs to be initialized, and Issues, Projects, Actions and many more additional functions must be set up for the newly created repository. Apart from the general Git commands (which work equally with GitHub), GitHub offers an additional API for interacting with all the extra functions that are not part of the standard Git software. One example is the statistics endpoint, which collects information about the general activity on the repository (such as an hourly commit count for each day). However, this information can also be gathered from the general UI.

A far more interesting public endpoint, one that is not visible in the general GitHub UI, is the events endpoint. It lists a multitude of different events on a particular repository, such as WatchEvents, CommitCommentEvents, DeleteEvents, GollumEvents (if you’ve never heard of this one, no worries; it is the event for a wiki page creation or update) and even PushEvents. You can query that endpoint for every public repository.

All events of a public repository
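
To make the query concrete, here is a minimal Python sketch of how the events endpoint could be read and reduced to PushEvents. The function names are our own for illustration and are not taken from github-secrets; the URL pattern is the one shown later in this post, and the sketch only looks at the type field of each event.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def events_url(owner: str, repo: str) -> str:
    """Build the public events endpoint URL for a repository."""
    return f"{API_ROOT}/repos/{owner}/{repo}/events"


def fetch_events(owner: str, repo: str) -> list:
    """Download one page of events (performs a network request)."""
    request = urllib.request.Request(
        events_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def push_events(events: list) -> list:
    """Keep only the PushEvents from a list of event objects."""
    return [event for event in events if event.get("type") == "PushEvent"]
```

Calling push_events(fetch_events(owner, repo)) would return the pushes among the events the API currently reports for that repository. Pagination and error handling are left out to keep the sketch short.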


Given that we are discussing hidden commits, let’s take a look at the PushEvents. We specifically ask ourselves the following questions: Does the commit history display all commits that have ever been pushed? Do we find any information about our previously deleted, now hidden commit?

Let’s recall the situation we described above: we have pushed an incorrect commit and want to reverse this action. Our first push with the incorrect commit created a PushEvent with an array of commit objects describing the pushed commits. When we now force-push an older version to the remote repository, we create an additional PushEvent. Instead of adding a new commit, this push simply rolls back the head of the remote repository. The previous incorrect commit is no longer part of the history and is not visible in the GitHub commit history UI.

Some online sources discussing this topic (for example, this StackOverflow thread) mention that a force-pushed commit may still be accessible via its SHA-1 hash. However, they don’t tell you how to find this hash if you didn’t see the commit while it was still in the regular history. This might give people the impression that if the force-push was fast enough and nobody saw the commit and its associated hash, there is no trace of the commit anywhere (since it has been deleted from the history). Don’t fall into this trap, as it could give you a false sense of security of having “quick-fixed the error”.

The same applies if you change your repository’s visibility from private to public: the events endpoint still lists all PushEvents, even from the time the repository was private, and the hash of our incorrect commit can still be seen there. The commit we tried to hide with the force-push can be found by anybody accessing the public events API!

Example of a force-pushed repository, reset to an older commit
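
The reasoning above can be sketched in a few lines of Python: collect every hash that the PushEvents mention and subtract the hashes that are visible in the regular history. This is an illustration under assumptions, not the implementation used by github-secrets. It assumes each PushEvent payload carries the fields before and head and, optionally, a commits array; GitHub may change the payload format. The function names are hypothetical.

```python
ZERO_SHA = "0" * 40  # placeholder Git uses when a ref did not exist before


def referenced_shas(events: list) -> set:
    """Collect every commit hash mentioned by the PushEvents in `events`."""
    shas = set()
    for event in events:
        if event.get("type") != "PushEvent":
            continue
        payload = event.get("payload") or {}
        # Assumed fields: branch tip before and after the push.
        for key in ("before", "head"):
            sha = payload.get(key)
            if sha and sha != ZERO_SHA:
                shas.add(sha)
        # The commits array may be empty (force-push) or missing entirely.
        for commit in payload.get("commits") or []:
            if commit.get("sha"):
                shas.add(commit["sha"])
    return shas


def hidden_candidates(events: list, visible_shas: set) -> set:
    """Hashes seen in PushEvents that are missing from the visible history."""
    return referenced_shas(events) - set(visible_shas)


def commit_url(owner: str, repo: str, sha: str) -> str:
    """URL under which GitHub serves a commit by its hash."""
    return f"https://github.com/{owner}/{repo}/commit/{sha}"
```

Every hash returned by hidden_candidates is only a candidate: it still has to be opened via commit_url to check whether GitHub serves it. The set of visible hashes has to come from elsewhere, for example from a local clone.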


Security implications

First, let’s be clear about one thing: as long as a pushed commit remains available in the normal Git history, anybody can clone or fork the repository in this state, so all sensitive data should be considered compromised no matter what. We have also just described how one can retrieve all the information necessary to access the deleted commit by finding the corresponding PushEvent, extracting the SHA-1 hash and accessing the commit via the hash. But this is not the only way to access commits that are no longer listed in the history. Sometimes it is even possible to brute-force commits using no more than the first four characters of the hash, as described in the Short SHA-1 section of this Git documentation.
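
To give a sense of why four characters are so little, the following Python sketch enumerates the prefix space. It only counts and generates hexadecimal prefixes; the helper names are our own, and whether a given prefix resolves to a commit depends on it being unambiguous in that repository.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def short_sha_prefixes(length: int = 4):
    """Yield every lowercase hex prefix of the given length."""
    for digits in product(HEX_DIGITS, repeat=length):
        yield "".join(digits)


def search_space(length: int = 4) -> int:
    """Number of distinct hex prefixes of the given length."""
    return len(HEX_DIGITS) ** length
```

With four hexadecimal characters there are 16^4 = 65,536 possible prefixes, which is a small number for an automated search.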

The only official way to completely delete an online commit (even from all cached views) is to contact GitHub Support directly.

Github Secrets

We have developed a tool that identifies these dangling or hidden commits, so that you can check whether they expose any data that should be deleted by GitHub Support. Generally speaking, there can be more commits linked to your repository than those listed in the general commit history of any branch. Not all of them are errors or force-pushed commits; this can also happen when merging pull requests, for example. For larger repositories, we recommend using a GitHub API key, which can be supplied as an environment variable.
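
As a rough illustration of the API key handling, the following Python sketch builds request headers from an environment variable. The variable name GITHUB_TOKEN is an assumption made for this example; check the tool’s documentation for the name it actually expects. Authenticated requests are subject to higher rate limits than anonymous ones, which is why a key helps with larger repositories.

```python
import os


def github_headers(environ=os.environ) -> dict:
    """Build request headers, adding a token only if one is configured."""
    headers = {"Accept": "application/vnd.github+json"}
    # GITHUB_TOKEN is a hypothetical variable name used for this sketch.
    token = environ.get("GITHUB_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Reading the key from the environment keeps it out of the command line and out of the shell history.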

Example

Example repository with a “deleted” commit (not online anymore): https://github.com/neodyme-labs/github-secrets-demo

Example repository


All commits in the GitHub UI


All events of this repository could have been accessed via: https://api.github.com/repos/neodyme-labs/github-secrets-demo/events

All events of a public repository


As you can see below, there is an empty PushEvent and the previous event has a commit that is not found in the GitHub UI:

Example of a force-pushed reset of the git head


Here is the deleted commit as recovered from the events API: https://github.com/neodyme-labs/github-secrets-demo/commit/ddc0ca8a18c4001fbca0ac433f1d2e7bdd882a68

Dangling commit


When you now run our tool, all of these commits are displayed directly, as shown here:

Dangling commits found with Github Secrets


More information and the tool itself can be found on GitHub.

We hope you have found this blog post and the accompanying tool helpful and informative. If you have inadvertently disclosed sensitive information via a Git commit, we highly recommend promptly rotating all affected secrets. This precaution is crucial, as you never know who may have already pulled the information.
