Saturday, March 12, 2011

Signs of Broken Authentication (Part 2)

Red Flag #2: Restricted Character Set for Passwords

In the last post, we examined limiting password length as an authentication red flag. Today I want to look at another common red flag, that of restricting the allowed character set for passwords. As a simple example, sometimes web sites will only allow alphanumeric characters in passwords.

The scenario goes something like this... you register for an exciting new web site that all your friends are clamoring about, and you, being the clever type, try a password like 'G00d-bye!'. (One that clearly no one would ever be able to guess ;-). After confirming this as your new password, you click on the 'Submit' button, and the web site returns an error and informs you that "you have one or more invalid characters in your password; please try again". (Thankfully, some of the more informative sites will actually tell you which characters they do accept; how helpful is that, huh?)

After trying various other unacceptable passwords, you eventually discover that this site's developers have apparently read XKCD's Exploits of a Mom and think they are preventing SQLi because they don't allow the evil “-” character in their passwords.

The really "ingenious" sites also don't allow '<', because after all, someone might create a password whose value is something like “<script>insert_evil_javascript_here</script>”, exposing an XSS vulnerability. (These same sites may reason that this is also good rationale to limit the password's length; make it short enough and no dangerous amount of script can be inserted.) They don't allow “:” for a similar reason, because after all, the really clever hacker may instead try to use “javascript:insert_evil_javascript_here” for their password.

Or you might find that the developers disallow characters like '$' or '@' or '|'. They may do this because they have implemented their web site in Perl and their password handling is using either 'eval' or `` or system() or pipes to pass your password string through to some other back-end system for the actual authentication processing and those characters are problematic in such cases. (If so, they possibly have worse issues, like command injection, but this is only a contrived example, so we'll go with it, okay?)

Pretty soon, these developers have had so much trouble with so many different special characters that they simply decide to disallow all special characters. Instead, they just check that every character in your password comes from their benign set, such as the alphanumerics. (On the bright side, at least this approach lends itself to white-listing rather than black-listing.)


But, “why is this a problem?” you ask. Well, because it greatly reduces the number of possible passwords of a given length. If N is the number of characters in the permitted password “alphabet” and L is the maximum length of the password, then there are only O(N^L) possible passwords with that “alphabet”. (The exact number of possible passwords with up to L characters chosen from an alphabet of size N is a bit larger, since obviously we can choose passwords of any length between the minimum length m and the maximum length L. Working out the exact number of possible passwords is left as an exercise for the reader.)

If alphabetic (both upper and lower case), numeric, and special characters on the typical QWERTY keyboard are permitted, the size of the “alphabet” is 95 characters (including space, but excluding tab and newline which are very difficult to enter from a web browser and assuming I counted correctly :). If you exclude all the special characters, you are left with only 62 alphanumeric characters. If you use a minimal length password, which many of you probably do (and for many sites, that is perfectly reasonable; using 'HuH75^mn43,1@#' is probably fine for my bank, but a bit overkill for the NY Times site, where clearly all passwords should be either “WSJ_rules” or “WashingtonPost”, just to protest such nonsense), then an 8 character password works out on the order of 62^8 or 218,340,105,584,896 possibilities. By comparison, if we were to allow special characters in the password, then an 8 character password has 95^8 or 6,634,204,312,890,625 possibilities, which, if I've done my math right, is about 30.4 times more. This means, for instance, if an adversary were able to brute force all the possible 8 character alphanumeric passwords in one day, it would take that adversary roughly a month to brute force a password comprised of all possible alphanumeric and special characters. (In reality, if off-line dictionary attacks are viable at all, these numbers are not too far fetched using a fairly cheaply built high-end farm of GPUs. But that's a topic left for another day.)
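For anyone who wants to check that arithmetic, here is a small Python sketch. The function names count_exact and count_range are made up for illustration; the second one simply sums N^k over every allowed length, which is one way to get the "up to L characters" total mentioned earlier.

```python
# Compare password search spaces for the two alphabets discussed above.
ALNUM = 62  # a-z, A-Z, 0-9
FULL = 95   # printable ASCII including space, as counted in the post


def count_exact(n: int, length: int) -> int:
    """Number of passwords of exactly `length` characters over n symbols."""
    return n ** length


def count_range(n: int, min_len: int, max_len: int) -> int:
    """Number of passwords of any length from min_len to max_len inclusive."""
    return sum(n ** k for k in range(min_len, max_len + 1))


alnum_8 = count_exact(ALNUM, 8)
full_8 = count_exact(FULL, 8)

print(f"{alnum_8:,}")               # 218,340,105,584,896
print(f"{full_8:,}")                # 6,634,204,312,890,625
print(f"{full_8 / alnum_8:.1f}")    # 30.4
```

Python integers are arbitrary precision, so the counts are exact rather than floating point approximations.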


So we see that restricting which characters may be used in a password is another red flag for authentication. But now you ask, "how do we fix this?". Well, first of all, we make one very important design decision. Once a user's password is submitted, we decide we will never, ever attempt to display it again. This is good on many different levels (especially for privacy reasons), but another major benefit is that we never have to worry about issues like XSS.

I will outline one simple scenario that I prefer, but obviously there are several variations of this that will work as well.

In your password handling code, which would cover not only your login page, but also your change / reset password pages as well, you immediately convert the password string to a byte array. (A char array will do as well [in languages such as Java, where these types are different], but byte arrays usually interface more easily with message digests or symmetric encryption APIs.) You should do this conversion as early as possible and you should use a specific character set, rather than the default, for the conversion. (Aside: I'd recommend UTF-8 since it is so widely supported; using the native default encoding is asking for trouble because if you ever change deployment architectures you could easily find yourself with a user store where older passwords no longer work on the new system.) Once you have converted it to a byte array, either hash the byte array (ideally using a suitably sized random salt) or encrypt it. Then encode it in some standard format for storing...base64 encoding is typical, and finally store it. Then all subsequent operations with the user's password are done via this (say) base64-encoded hashed or encrypted password which you have securely stored somewhere.
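Here is a rough Python sketch of the hashing variant of that scenario, not a definitive implementation. A few things in it are my assumptions rather than anything stated above: it uses PBKDF2-HMAC-SHA256 from the standard library as the salted hash, the salt size and iteration count are placeholder values, the "salt$digest" storage layout is invented for the example, and the names hash_password and verify_password are hypothetical.

```python
import base64
import hashlib
import hmac
import os

ENCODING = "utf-8"    # explicit character set, never the platform default
SALT_BYTES = 16       # assumed salt size
ITERATIONS = 200_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> str:
    """Return 'base64(salt)$base64(digest)' suitable for storing."""
    pw_bytes = password.encode(ENCODING)  # convert to bytes as early as possible
    salt = os.urandom(SALT_BYTES)         # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", pw_bytes, salt, ITERATIONS)
    return (base64.b64encode(salt).decode("ascii")
            + "$"
            + base64.b64encode(digest).decode("ascii"))


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(ENCODING), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)


stored = hash_password("d0n't-ever-d0-th4t!")
print(verify_password("d0n't-ever-d0-th4t!", stored))  # True
print(verify_password("d0nt-ever-d0-th4t", stored))    # False
```

Notice that nothing here ever inspects which characters the password contains. The quote, the hyphen, and even a '<' are just bytes going into the hash, and only the base64 text ever reaches the user store.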

Final Word

One final word on this. A colleague and I have been experimenting with using a Web Application Firewall (WAF) to monitor some web sites. One thing that we noticed is that occasionally, the WAF will flag a password containing certain special characters (usually, single quote (') or hyphen (-), but occasionally '<') as an attempted SQLi or XSS. In almost all cases, these are false positives where end users are innocently trying to use these characters. For example, a user may try to enter "d0n't-ever-d0-th4t!" as her password, but the WAF thinks that this is an attempted SQLi because of the presence of the single quote and/or hyphen. Unfortunately, the default action of the particular WAF that we've been experimenting with is to block such requests. Such default behavior, if allowed, is likely to frustrate users and your help desk, so it is easy to see how the site's developers might respond by simply not allowing such troublesome characters in passwords to start with.

But IMHO, this is the wrong tack. Rather than trying to work around the symptoms, fix the problem where it is...the broken WAF rules. A wacky WAF is no excuse for dumbing down users' passwords!

Next time I will discuss a much worse variant of the red flag examined in this post, as well as the failure of authentication systems to provide automatic account lockout.

Until then,
