GitHub is one of the most popular source code hosting services in the world, if not number one. The convenience that the service offers has been incredibly useful to developers worldwide, possibly millions of them. In fact, I’m willing to bet that many applications wouldn’t have been developed as effectively if it weren’t for GitHub.
When it comes to cloud security, or security on any third-party network, the responsibility for protecting infrastructure belongs to the owner of the infrastructure. For example, it’s Amazon’s responsibility to make sure that unauthorized people can’t physically breach any of the datacenters that host AWS. But the security of a developer’s third-party hosted content is the responsibility of the developer. And a study conducted by researchers at North Carolina State University has revealed that a huge number of developers who use GitHub don’t secure their API and cryptographic keys. The researchers summarize their work as follows:
“GitHub has become the most popular platform for collaboratively editing software, yet this collaboration often conflicts with the need for software to use secret information. This conflict creates the potential for public secret leakage. In this paper, we characterize the prevalence of such leakage. By leveraging two complementary detection approaches, we discover hundreds of thousands of API and cryptographic keys leaked at a rate of thousands per day. This work not only demonstrates the scale of this problem, but also highlights the potential causes of leakage and discusses the effectiveness of existing mitigations. In so doing, we show that secret leakage via public repositories places developers at risk.”
Imagine a 50-story commercial tower where tenants tend to stash their physical keys in random spots all over a publicly accessible lobby. Someone up to no good could easily find the keys, sneak into the offices of insurance companies or dental practices in the evening, and destroy those businesses if they wanted to. If the nighttime cleaning crew has high turnover, who would even notice before it’s too late? That’s the sort of mistake that many GitHub users are making every day.
NCSU researchers scanned nearly 13% of GitHub’s public repositories over a timespan of about six months. As per the paper: “We find that not only is secret leakage pervasive – affecting over 100,000 repositories – but that thousands of new, unique secrets are leaked every day.” Ouch. The insecure keys range from API tokens to SSH keys to TLS certificates and more.
Here are some more gruesome details.
The researchers found a user who appeared to be exploiting YouTube in a way that could involve copyright infringement. “We found a total of 564 Google API keys in these repositories, along with indications that they were being used to bypass rate limits. Because the number of keys is so high, we suspect (but cannot confirm) that these keys may have been obtained fraudulently. We could not locate the keys elsewhere in our limited view of GitHub, but it is possible that the keys came from elsewhere on GitHub.” That not only puts developers who use YouTube APIs legitimately at risk, but possibly also Google as a whole.
Among the many web services with exposed keys on GitHub was a major website, which the researchers won’t name, that is used by millions of American college applicants. Another was the website of a western European government agency. That leak involved AWS credentials, which puts a lot of very sensitive public sector data at serious risk! “In that case, we were able to verify the validity of the account, and even the specific developer who committed the secrets. This developer claims in their online presence to have nearly 10 years of development experience. These examples anecdotally support our findings that developer inexperience is not a strong predictor of leaks.”
That’s one of the things that I personally found the most worrisome. The people who are leaving their keys within easy reach of cyber attackers aren’t just kids who are starting to dip their toes into serious application development. These mistakes are also being made by seasoned developers who are working for corporations and major government agencies. They are supposed to know better.
What could GitHub users do to better protect their keys? Git and GitHub offer tools that many people aren’t using, at their peril. “The .gitignore file is intended to allow developers to specify certain files that should not be committed, and it can be used (among other things) to prevent files containing secrets from being leaked. While this is a good strategy, many developers do not use this option and some do not fully understand it. To better investigate its usage, we used the Search API to collect .gitignore files over a 3-week period in the same manner as our normal collection process. Our assumption had been that these files would not contain secrets as they should only contain path references to files. Yet, we identified 58 additional secrets from this process. While this is a small number compared to the full dataset, this finding indicates that some developers commit secrets out of fundamental misunderstandings of features like .gitignore.”
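To make that misunderstanding concrete: a .gitignore file should list path patterns such as ".env" or "*.pem", never the secret values themselves. The following is a minimal Python sketch of a sanity check for that mistake. It is not the researchers' tooling. The function name and the two patterns are illustrative assumptions, and the sample key is the placeholder value used in AWS's own documentation.

```python
import re

# Illustrative patterns only (an assumption, not the paper's pattern set).
# Long-term AWS access key IDs are documented as starting with "AKIA".
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}


def suspicious_gitignore_lines(text):
    """Return (line_number, pattern_name) pairs for lines of a .gitignore
    file that look like secret values rather than path patterns."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


# Hypothetical .gitignore content: two valid path patterns, then a pasted key.
sample = ".env\n*.pem\nAKIAIOSFODNN7EXAMPLE\n"
print(suspicious_gitignore_lines(sample))
```

A check like this only catches secrets with a recognizable format. Generic passwords or random tokens would need other heuristics.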
Cloud services like AWS, Microsoft Azure, Google Cloud Platform, and GitHub offer users many tools and features that can be used to help keep applications secure. And sadly, even experienced professionals aren’t using them. Remember, the security of applications in the cloud is the responsibility of developers.
But if developers are operating within your DevOps environments, you may have to take a more active role in enforcing security best practices. Automating machine identity protection within your DevOps lifecycle will help eliminate many of the human errors that result in exposures like those we’ve seen on GitHub.
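One simple form of such automation is a check that runs before every commit and blocks it when newly added lines look like secrets. The Python sketch below assumes it is wired up as a Git pre-commit hook. The function names and the pattern list are hypothetical, and the diff parsing is deliberately simplified. It is a starting point under those assumptions, not a replacement for a dedicated secret scanner.

```python
import re
import subprocess
import sys

# Illustrative patterns only; a real deployment would use a maintained rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_added_lines(diff_text):
    """Return (path, pattern_name) pairs for added lines in a unified diff
    that match a secret pattern. Simplification: any line starting with
    "+++ " is treated as a file header."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("+"):
            for name, pattern in PATTERNS.items():
                if pattern.search(line[1:]):
                    findings.append((current_file, name))
    return findings


def main():
    """Scan staged changes; return 1 (block the commit) if anything matches."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = scan_added_lines(diff)
    for path, name in findings:
        print(f"possible {name} in {path}", file=sys.stderr)
    return 1 if findings else 0
```

A hook script would end with sys.exit(main()), so that a non-zero exit status stops the commit. Because client-side hooks can be skipped by individual developers, the same scan is worth repeating in a CI pipeline.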
Could GitHub do a better job of teaching users how to use features such as .gitignore properly?