Disclaimer: I disclosed this security issue to GitHub, and they chose not to fix it (“We have reviewed your report and determined that this functionality is working as expected”). This is undocumented behavior, so I am describing it here. Also, note that I am not asking anyone to hack GitHub, nor am I going to do so myself. All I want is better security and a level playing field: everyone should know how safe their files are on GitHub.com.
Short version: This is a security weakness in private GitHub repositories. I am not talking about public repositories, which are available to anyone without a username or password. Source files stored in Git are also safe. However, attachments from GitHub’s private issue tracker can be viewed without any authentication or authorization. Not all attachments are at risk. Word, Excel and PDF documents (.doc, .docx, .xls, .pdf) are kept private. Image files (.jpg and .png) are accessible to anyone, without a username or password, even if the repository is not shared with anyone and even if you are paying GitHub for a paid account (Developer, Team, Business). The attacker only needs to know the URL of the image… and that is disturbing. Also, once uploaded, you cannot remove the photo, image or blueprint. It stays there forever…
Long version: You are paying from $7 to $21 per user per month to host your private or business projects on GitHub (I have tested only the $7 account, but my guess is that all of them work the same way). You have read the GitHub Security article (advertisement). You think that they do what they say they do (“We know your code is extremely important to you and your business, and we’re very protective of it”), but… as every big company does, they fail… and they fail to fix security issues even when they know about them. I don’t know why this happens, but I think that bounty programs play a big role here. They need to give away money to random people to fix their software defects. Sometimes they choose to do nothing: because of the high volume of false positives among bug reports, they cannot afford to have security experts look at every issue submitted.
About the security issue:
Create a new issue in any of your private repositories and attach an image to it.
At this point your private images, photos and blueprints are publicly available, without any authentication or authorization. You don’t even need to submit the issue. Your images are already uploaded! Even more alarming is that you cannot remove the images, drawings or blueprints once they are uploaded. There is no delete button. Deleting the issue’s comment, or even deleting the whole GitHub repository, does not help. The images are still there.
When you delete a private GitHub repository, nothing tells you that some files will be left on the web forever.
Even more, they say that “This will permanently delete the repository, wiki, issues, and comments, and remove all collaborator associations.” Of course, this is not the case. Your issue attachments are kept there forever.
I can speculate that writing the image cleanup code is more expensive than HDD space nowadays. I have not tested over very long periods. Maybe there is some code that scans comments and removes unused (unlinked) images once a quarter or once a year, but I have no evidence of such a thing. So the truth is: once uploaded to GitHub, your images are there forever, or at least for weeks.
I have tested both .jpg and .png images. They are hosted on the https://cloud.githubusercontent.com/ server and are accessible to anyone. On the other hand, document files like .pdf, .docx, etc. are hosted at https://github.com/YOUR_ACCOUNT_NAME/your-repository/files/your-document.pdf and are kept safe.
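To make the difference between the two hosting locations concrete, here is a minimal sketch that sorts attachment URLs by the host they live on. The function name and the labels are my own for illustration, and the host-to-exposure mapping only reflects what I observed in my tests; GitHub may change it at any time.

```python
from urllib.parse import urlparse

# Hosts named in this article. Treating them as a fixed mapping is an
# assumption based on my own tests, not documented GitHub behavior.
PUBLIC_IMAGE_HOST = "cloud.githubusercontent.com"
AUTHENTICATED_HOST = "github.com"


def classify_attachment(url: str) -> str:
    """Return a rough exposure label for an issue attachment URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == PUBLIC_IMAGE_HOST:
        # Images: anyone who knows the URL can fetch the file.
        return "public-by-url"
    if host == AUTHENTICATED_HOST and "/files/" in parsed.path:
        # Documents: served from the repository, behind the login.
        return "login-required"
    return "unknown"
```

Running something like this over the attachment links you find in your own private issues gives a quick list of the files that are reachable without a login.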
And to be fair, here is GitHub’s view on the issue. They say two things:
The links have a UUID / GUID in them (more than 120 bits of entropy), so an attacker cannot guess them.
Do not share links from the private issue tracker with third parties.
Let me begin with the second point: do not share links… If you are computer savvy or security oriented, then you may try not to share the link with third parties. In practice, however, this is not an easy task. First, there are toolbars and browser add-ons/plugins that grab all the links for various purposes like virus checking (so private files are shared with third parties because of the lack of proper authentication). Also, not so long ago the popular Google Toolbar and Alexa Toolbar used such information to index the deep web, so your hidden URLs started to appear in the search index. There are countless examples of this happening:
How Did Google Find this Hidden File? (When you use Chrome, the typical setup is to have its “Under Hood” features enabled. Such as “Use a web-service to resolve navigation”, “Use a prediction service to complete searches”, and “Enable Phishing and malware protection”. All these services make a very liberal use of various services at Google.)
There were some myths and word games between Google and the webmaster community about whether Google uses URLs for indexing purposes or not, but one thing is certain: Google Toolbar was sending URLs to Google. So, if you visited your GitHub issues page with Google Toolbar enabled, then your URLs were shared with Google, and you can only imagine how the search giant used that information. There are some more excuses from Matt Cutts. But even one of the brightest minds at Google says: “Security through obscurity is not a great way to keep a url from being crawled”. GitHub, if you do not want to listen to me, maybe listen to Google? According to Wikipedia, “security through obscurity” was rejected as a method of security in 1851, but we are still doing it in 2017.
I see some hands rising: “but… but… GitHub has a robots.txt file at the root of https://cloud.githubusercontent.com/” -------
But again: do you want your important files to be protected by one text file that is only a recommendation, not a law, or do you want proper authentication, where a username and password are required to access your files? The robots.txt file stops only robots that obey its rules. The text file does not protect against humans or hackers.
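The advisory nature of robots.txt is easy to see in code: the check runs inside the crawler, not on the server. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is a hypothetical example (it is not the actual file served by GitHub), and `ExampleBot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def polite_crawler_may_fetch(url: str, robots_txt: str = HYPOTHETICAL_ROBOTS_TXT,
                             user_agent: str = "ExampleBot") -> bool:
    """Ask the question a well-behaved crawler asks itself before fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A polite crawler calls something like this and walks away when the answer is `False`. A browser, a script or an attacker never calls it at all, and the server has no way to tell the difference. Only real authentication on the server side changes that.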
And there are countless other ways you can share links with third parties without your consent. As mentioned earlier, antivirus / antispyware software sends URLs for virus scanning, where sometimes human interaction is required (a third party sees your images), and so do toolbars and browser add-ons/extensions. You can also send a link to a misspelled email address: you think that only you and your team have access to the GitHub repository, but sorry… now the person with the misspelled email address also has your file.
The first argument, about 120 bits of entropy and UUID / GUID, is also very disturbing. It is clear that the person at GitHub who reviews submitted issues has no clue about UUIDs/GUIDs and entropy. I am not here to teach you about all the historical problems with uniqueness, guessability or security. Just read some information from the links provided.
How securely unguessable are GUIDs? (Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access))
From Wikipedia – Universally unique identifier (Pseudorandom number generation often lacks necessary entropy, and RFC 4122 recommends that when a high-grade source of randomness is not available, that one of the other UUID versions be used instead. Some implementations of version 4 UUIDs do not satisfy this requirement)
Here are some UUID examples. All of them are from GitHub. Maybe they are not guessable on the spot, maybe they have enough entropy, but what if they use a weak PRNG, or what if some other programmer replaces the UUID-generating function without any clue that the UUIDs are (improperly) used for security? What if they use a standard UUID generation algorithm and are easily predictable? -------
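To illustrate why “it contains a UUID” says little about guessability on its own, here is a small sketch using Python's standard `uuid` module. It does not claim anything about which UUID version or generator GitHub uses (I do not know that); the `describe` helper is hypothetical and only shows how different the versions are. A version 1 UUID encodes its creation time and a node identifier, while a version 4 UUID carries 122 random bits whose quality depends entirely on the random source behind it.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals since this date.
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def describe(u: uuid.UUID) -> dict:
    """Summarize what a UUID reveals about how it was generated."""
    info = {"version": u.version}
    if u.version == 1:
        # Time-based: the creation time and node ID can be read back out.
        info["created"] = UUID_EPOCH + timedelta(microseconds=u.time // 10)
        info["node"] = u.node
    elif u.version == 4:
        # Random: 122 of the 128 bits come from the random generator.
        info["random_bits"] = 122
    return info
```

Calling `describe(uuid.uuid1())` returns a creation time close to “now”, which is exactly the kind of structure an attacker can exploit to narrow down a search. If a developer ever swaps the generating function for a time-based one, the URLs keep looking the same to a casual eye while becoming far easier to predict.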
I hope that GitHub will plug the hole… soon 🙂
Short version: Gmail leaks your username. Always! To get the username/login of a Google Apps user (paid Work/Business account) you need one email message. Just look at the Return-Path header. Fortunately, the attacker does not know the password yet, but combined with other weaknesses (like password reuse) that is not much of an obstacle. Determining whether someone is using Google Apps for Work (Business) is trivial. And this method works in 100% of cases. Even administrators cannot hide their administrator username.
Long version: Some time ago, while doing a security audit / consulting, I found a small security hole in Google’s Gmail for Work (previously Gmail for Business). Of course, as a responsible citizen, I notified Google’s security team that their popular web email client has a security problem. They responded promptly and very politely, asked a couple of questions, and to my surprise said that this is not a security issue and they are not going to fix it.
Normally, I would agree that knowing your login name is not a problem. However, in the real world, in computer systems that are used by millions of people, these “normally” rules do not apply anymore: password reuse happens, brute-force password guessing is a reality, lazy admins exist, etc. One more thing: IT security is not a simple thing. It is more like layers of various pieces that, put together in the right combination, make reasonably good security. By revealing your username, one of the two credentials, you are eliminating one piece of that security layer. And this is absolutely unnecessary. Whatever Google wanted there, it can be achieved in many different ways without compromising that one small piece of security.
Also, I must note that this weakness is documented by Google. It is not rare for a software author to describe some feature in the documentation and for a vulnerability to appear later, when the scope of the feature changes. I suspect that something similar happened here.
I suspect that this feature was originally implemented for regular @gmail.com users who send email via Gmail servers but use a non-Gmail email address as the sender (alias). So, for example, your Gmail address is firstname.lastname@example.org and you set up an alias for email@example.com. In the original Gmail this feature makes perfect sense. If you are too spammy, then any administrator in any organization can look at your email headers (Return-Path) and determine your real email address (firstname.lastname@example.org) or block it altogether, without any action needed from Google’s team. They offer a free service and do not need to spend money on a support team. But by revealing your email@example.com, it also reveals your Gmail login. But again, this is fine: it is a free service, everyone already knows that your email address is your login name, you have turned on 2-factor auth, etc.
But it does not make any sense for Business email, where the email address and the alias are on the same server. There should be an option to disable this Return-Path thing, or better, it should be disabled automatically when the admin has configured the alias domain on the same Google server. I would somewhat understand the Return-Path being added in Business email when the alias resides on non-Google servers. But why on earth is the header added when both domains reside in one Google email account and the email is sent from that same account?
One more problem is that users of Google’s paid email service are not aware of the issue. A tech-savvy admin might think that creating an alias different from the administrator username somehow protects the admin’s real email address. Assuming that no one knows that address, she could fall victim to a phishing email that looks like it comes from Google support, because she thinks that only Google knows this hidden admin email address. She has not revealed this email address to anyone except Google. Think about it. If she receives email at that address, it must be genuine, from Google.
This is how the Return-Path header looks. ADMIN-USERNAME@example.com is the address that you are trying to protect, and you are sending email from firstname.lastname@example.org.
Received: by 10.128.17.7 with SMTP id a64csp423452wmf;
        Sat, 26 Apr 2016 05:02:04 -0700 (PDT)
X-Received: by 10.17.107.18 with SMTP id 40mr24542334ior.101.124354452334;
        Sat, 16 Apr 2016 05:02:04 -0700 (PDT)
Return-Path: <ADMIN-USERNAME@example.com>
Received: from mail-ia0-x234.google.com (mail-ia0-x234.google.com. [2607:f9b1:4002:c06::234])
        by mx.google.com with ESMTPS id ...
From: email@example.com
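Anyone who receives a single message can read the hidden address back out with a few lines of code. The sketch below uses Python's standard `email` package; the function name `leaked_login` is hypothetical, and the logic assumes (as described above) that the Return-Path carries the real account while From carries the alias.

```python
from email import message_from_string
from email.utils import parseaddr


def leaked_login(raw_message: str):
    """Return the Return-Path address if it differs from the From address."""
    msg = message_from_string(raw_message)
    # parseaddr strips the angle brackets and any display name.
    _, return_path = parseaddr(msg.get("Return-Path", ""))
    _, from_addr = parseaddr(msg.get("From", ""))
    if return_path and return_path.lower() != from_addr.lower():
        # The sender used an alias, but the real account is still visible.
        return return_path
    return None
```

Fed the raw source of a message like the one above, this returns `ADMIN-USERNAME@example.com`. No special access is needed: every mail client offers a “show original” or “view source” option that exposes these headers.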
And again, if you have not enabled two-factor authentication, do it now. Do it for every service that supports it (Google, Dropbox, Amazon, Twitter, etc.).
And if you still think that revealing usernames is somewhat acceptable, try to educate yourself by reading other people’s opinions. For example here: Disclose to user if account exists? And by the way, the Wikipedia article about phishing begins with the sentence: “Phishing is the attempt to obtain sensitive information such as usernames…”