Sunday, January 26, 2014

Username Harvesting - easy to do and hard to thwart

In my last blog post I explained how to gather usernames from document metadata using your favorite search engine, a browser and a metadata parsing tool – then scripted it for convenience. In this post I will delve into application usernames; what they are, how to harvest them from popular websites, the flaws that make this possible and what can be done to mitigate the threat.

Applications come in many shapes and sizes. This post discusses common web applications in the form of social interaction websites and websites that allow for sensitive transactions. Almost all social websites and every online banking website have a login/sign-up feature that lets users create a username and password, which in turn allows them to log in and interact with the website. The ability to choose a username, set a password and then configure preferences translates to options or choices. What I mean by this is that the power is in the users' hands to create a strong password and change privacy settings, for example. Other elements of the website experience are at the discretion of the company providing the service, such as the SSL settings that our app or browser will negotiate, whether SSL is offered to begin with, page caching, SEO-related settings (dictating how content that we share can show up in search engine results) and which configuration choices are made available. The point is that the onus is both on the web application and on us to secure our information.

When a website doesn't offer the option to connect via HTTPS, it is difficult to log in with transport layer encryption, just as it's hard to create a strong password if the website enforces a four digit limit on passwords. The flip side is when a website offers both HTTP and HTTPS login functionality and supports passwords that are 100+ characters in length, yet folks choose to log in over HTTP and set a four digit password. The solution comes down to education. For instance, logging into Gmail, navigating to the settings section and selecting the “Always use https” radio button is a simple change that offers browser to server encryption throughout your entire session, not just when you log in (nowadays the default is to always use HTTPS, but it wasn't always that way).

Times are getting better. Web applications are becoming more secure and offering better ways to protect your account and information, such as MFA (multi-factor authentication). An example of this is how Twitter offers you the choice to require that an SMS (text message) with a login code be sent to a phone number of your choosing to augment the login process. No longer is a username and password sufficient; rather, a third factor (arguably... perhaps in another post) is required to gain access to your account. Twitter accounts, banking accounts, email accounts and folks that shop at Target seem to get compromised on a daily basis. The reason for this is ultimately motivation: with enough motivation, no account is safe. Aside from a sufficiently motivated attacker (or APT situation), lots of other reasons exist for the constant breaching of accounts, mainly weak passwords, guessable password reset questions, poor choices made by the account owner and poor choices made by the service provider. Sometimes it comes down to security vs convenience, in that folks don't want to be bothered with checking their SMS every time they want to log in to Twitter, or with keeping separate passwords for every website they interact with. The result is missed opportunities and low hanging fruit. The easier a target you are, the more likely you are to become a victim. At the end of the day, websites today still ask you for a username and a password and that is all it takes to log in, so let's finally delve into 50% of that equation: your username.

What is username harvesting? The phrase “username harvesting” refers to a vulnerability that, when exploited, allows people or programs interacting with an application to tell a valid username from an invalid one. It's time to start thinking: Facebook, Twitter, LinkedIn, Gmail, your bank's website. Determining valid usernames is trivial in most cases, usually because the application is written in such a way that it is simply not feasible to avoid divulging valid usernames, as when creating an email account. Two people are not typically allowed to register the same email address, and the website used to create the address will clearly convey this fact when the username that you want is already taken.
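To make the email sign-up case concrete, here is a minimal sketch of a toy registry, not any real provider's code. The class name, the response strings and the case-insensitive comparison are all assumptions made for illustration. The point is only that enforcing unique names forces the application to answer differently for a taken name.

```python
class Registry:
    """Toy sign-up backend (hypothetical) that enforces unique usernames."""

    def __init__(self):
        self._taken = set()

    def register(self, username: str) -> str:
        # Assumption: names are compared case-insensitively.
        key = username.strip().lower()
        if key in self._taken:
            # This distinct answer is the harvesting signal.
            return "User already exists"
        self._taken.add(key)
        return "Account created"


registry = Registry()
first = registry.register("Validuser01")
second = registry.register("validuser01")
print(first, "/", second)
```

Anyone who can reach the sign-up form can read that second answer, which is why this flavor of the flaw is so hard to design away.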






If you are a bad guy or a penetration tester with explicit written permission to actively attempt to access other folks' accounts, then you are in luck, as you likely have half of the information needed to log in to any account that you discover. With webmail accounts there is not a convenient way around this issue, but it's not just email. As a penetration tester I see this vulnerability on almost every web application that I come across, even in the banking industry where you may think security is tight(er). The difference is that instead of coming right out and stating when a username is valid, there are subtle variations in the way the page is displayed to the user which indicate valid vs invalid usernames. The trick when performing a penetration test is automating the harvesting of usernames and coupling that with “account grinding”, in which one brute forces the password until access to the account has been gained. The good news is that most reputable services provide some type of account grinding protection: captchas, account lockout thresholds, IP banning, timing defenses and other application layer mechanisms that slow the process down or thwart it altogether. Don't kid yourself: obtaining usernames is not difficult. Gaining access to an account, on the other hand, can be exceedingly challenging when the account owner and service provider take the appropriate actions to make it difficult. Let's move on to some examples to clearly illustrate these concepts.


As a penetration tester, I am not currently working with Google or Facebook, and if I were I certainly would not be blogging about the experience. I do, however, have permission to show some examples of username harvesting and account grinding using my development network with custom written applications. I could probably have skipped writing this paragraph, but as a professional I want to make it clear that engaging in this kind of activity is not ok unless you have explicit written permission to do so from the appropriate personnel with direct jurisdiction over the target in question.

This first example shows how to manually determine valid usernames using just a browser and, as a quick example target, Gruyere. The login page has two visible fields: “User name” and “Password”.





If you type in a valid username and an invalid password, you get the same visible response as you would if you provided an invalid username and/or password, which is a good thing. This way an attacker can't tell whether it was the username or the password that was invalid, so no username is divulged. It's good from a security perspective since the application is being vague, but it's slightly inconvenient from a usability perspective. I mention this because oftentimes a company has to choose between slightly better security and slightly more convenience for its users.
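The vague login behavior described above can be sketched in a few lines. This is not Gruyere's code: the user store, the error text, the sample password and the PBKDF2 iteration count are assumptions chosen for illustration. Every failure returns the same message, and an unknown username still goes through the same hashing work so that response time does not give the game away either.

```python
import hashlib
import hmac
import os


def _hash(password: str, salt: bytes) -> bytes:
    # Iteration count is illustrative, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical in-memory user store: username -> (salt, password hash).
_SALT = os.urandom(16)
USERS = {"Validuser01": (_SALT, _hash("correct horse battery staple", _SALT))}

# Dummy credentials so unknown usernames cost the same work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("dummy", _DUMMY_SALT)

GENERIC_ERROR = "Invalid user name or password."


def login(username: str, password: str) -> str:
    salt, stored = USERS.get(username, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = _hash(password, salt)
    # Constant-time comparison; the membership check keeps the dummy
    # entry from ever authenticating anyone.
    ok = hmac.compare_digest(candidate, stored) and username in USERS
    return "Welcome!" if ok else GENERIC_ERROR


print(login("Validuser01", "wrong"))
print(login("nosuchuser", "wrong"))
```

Both calls print the same message, so the login form by itself tells an attacker nothing about which names exist.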



Moving on to the “Sign up” page reveals a username harvesting vulnerability as the application will tell you if a username already exists once the “Create account” button is clicked.




Pretty straightforward so far: try to sign up with a username already taken by someone else and the website tells you that the user already exists, much like the popular webmail services mentioned before. At this point, essentially two things will dictate how much effort it will take to break into the account: how strong the password is that the user has chosen, and what security mechanisms the website (and company in general) has in place to protect your account. Within those two essential aspects are endless permutations that directly relate to the overall security of your account: who else has access to your computer, is your browser saving your password, how does this “Gruyere” company conduct employee background checks, who at the company can access the database where passwords are stored, are the passwords in the database salted and hashed properly, do the employees of the company participate in regular security awareness training, are you logging in using HTTP at an internet cafe, are you “borrowing” free wifi, is there already malware on your computer, where else are you using the same password, etcetera.

At this point we know that the user “Validuser01” exists, because the web application told us “User already exists”. In this example scenario, the next step is to determine what protection mechanisms are in place to thwart account grinding. What to look for is anything that will slow down a brute force password attack. In this case there is no captcha, no account lockout and nothing else stopping endless password guessing, so we can move on to trying to gain access to the account using a brute force technique. Burp's Intruder feature works well for this:



Burp lets you configure the password parameter that will be the target of the brute force and offers payload options where dictionary lists can be inserted. The better the password list and the weaker the password, the less time it takes to brute force the account.
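For contrast, here is a rough sketch of the kind of account lockout threshold that the example application is missing. The class name, the threshold of five failures and the 15 minute lockout are assumptions for illustration, not values taken from any particular product.

```python
import time


class LockoutTracker:
    """Hypothetical per-username lockout after repeated failed logins."""

    def __init__(self, max_failures=5, lockout_seconds=900, clock=time.monotonic):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self._clock = clock          # injectable so it can be tested
        self._failures = {}          # username -> consecutive failures
        self._locked_until = {}      # username -> time the lock expires

    def is_locked(self, username: str) -> bool:
        until = self._locked_until.get(username)
        if until is None:
            return False
        if self._clock() >= until:
            # Lock expired: clear state and allow attempts again.
            del self._locked_until[username]
            self._failures.pop(username, None)
            return False
        return True

    def record_failure(self, username: str) -> None:
        count = self._failures.get(username, 0) + 1
        self._failures[username] = count
        if count >= self.max_failures:
            self._locked_until[username] = self._clock() + self.lockout_seconds

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)
```

A login handler would call `is_locked` before checking the password and `record_failure` after each bad guess. Counting failures for names that do not exist as well keeps the lockout itself from revealing which usernames are valid.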

In the previous example the web application clearly stated that an account name was already being used, which is confirmation that the username exists. Banking applications, however, will not always make it so easy to determine a valid username. What to look for is variations in verbiage and website redirection, both of which can indicate whether a valid or an invalid username has been submitted. Here is a simple login box for a banking website. It asks for a username and has a “Sign In” button:





By entering a username that is probably not valid and clicking “Sign In” I am directed to a page that looks like this:



By submitting a username that is likely to be valid, or at least different, I can observe the verbiage on the page that I am directed to:


The differences between the two pages are in the verbiage: the invalid attempt directed me to a page where I was asked to reenter my username, while the valid attempt directed me straight to a password (passcode) page. Other possible pages that the bank may direct me to are “secret” question pages, where I would be required to provide an answer to a question that only the account holder should know. An “account lockout” page is another possible option. When this page is reached, it means the web application employs brute force protection in the form of locking out the account if too many invalid passwords are submitted, but it also indicates that the account is valid. There are variations on these common practices, and once again it is difficult to make the process easy and convenient for valid users but hard for attackers.
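The page differences described above boil down to a small lookup, sketched here for use on an authorized test. The marker phrases are hypothetical stand-ins for whatever verbiage a given application actually shows, and the function only inspects text it is handed. It makes no requests of its own.

```python
from enum import Enum


class Verdict(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"


# Hypothetical marker phrases; a real application will use its own wording.
MARKERS = [
    ("re-enter your username", Verdict.INVALID),
    ("enter your passcode", Verdict.VALID),
    ("security question", Verdict.VALID),
    ("account has been locked", Verdict.VALID),
]


def classify(page_text: str) -> Verdict:
    """Map the page reached after submitting a username to a verdict."""
    text = page_text.lower()
    for phrase, verdict in MARKERS:
        if phrase in text:
            return verdict
    return Verdict.UNKNOWN


print(classify("Please re-enter your username."))
print(classify("Your account has been locked."))
```

Note that the lockout page lands in the valid column, which matches the observation above: the protection itself confirms that the account exists.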

Taking this to the next level would come in the form of automating the username harvesting element and coupling it with brute-forcing the accounts or social engineering attacks. An example of a more challenging interface to harvest usernames from could look like this:




When a failed attempt is made the page looks like this:




It's not perfect but the end user does not know which of the three inputs was incorrect and no helpful verbiage is displayed regardless of the information entered. Another thing to keep in mind is password reset functionality as it too is often vulnerable to username harvesting.
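The same idea carries over to password reset. Below is a minimal sketch, assuming a hypothetical set of known addresses and an in-memory list standing in for outgoing mail. The response is identical whether or not the account exists, and only the mailbox owner learns the difference.

```python
RESET_MESSAGE = "If that account exists, a reset link has been sent."

# Hypothetical data: known accounts and a stand-in for the mail queue.
KNOWN_ACCOUNTS = {"validuser01@example.com"}
outbox = []


def request_reset(email: str) -> str:
    normalized = email.strip().lower()
    if normalized in KNOWN_ACCOUNTS:
        # Stand-in for sending the reset email.
        outbox.append(normalized)
    # Same answer either way, so the form reveals nothing.
    return RESET_MESSAGE


known_reply = request_reset("Validuser01@example.com")
unknown_reply = request_reset("nobody@example.com")
print(known_reply == unknown_reply)
```

A real implementation would also want the two paths to take about the same amount of time, since a noticeably slower reply for known accounts is a signal too.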


Writing a web interface that is both user friendly and secure is challenging, and the flaws that make username harvesting possible are sometimes marketed as features. Make use of the options made available to you to lock down your accounts. That's all for now - keep your passwords (and phrases) long, strong and different for every site and application... till next time.