Password Rules Are Bullshit


Couldn’t agree more.



Ah, Unicode… I remember reading that ligatures (or something like them) typed on Windows versus, say, a Mac can look identical but have different codepoints. So you can end up thinking you have the same password, but because a password hashing algorithm only sees the bytes, it could actually hash differently. I’m having trouble finding exactly which characters can be typed differently but look the same. And of course you have the issue of actually typing these in on different input devices. Maybe this is one of those things that was a problem but isn’t anymore. That said, I tend to lean toward caution on Unicode passwords unless you’ve tested them on all your devices.

That said, I’m all for passphrases. I use a 30+ character passphrase for my password manager, and a similar length for device logins where I can’t use the password manager. And, like another poster, I hate when sites, or even worse apps (Ally Bank, I’m looking at you), will not let me paste or form-fill my credentials.

I’ve personally thought that, for logins on devices without a full keyboard, sites should support a device auth section where a new device can be authorized from another, already logged-in device. So I log in on a computer, I take a picture (or just enter my username) on my phone, the phone shows up in the device table, I get a notification, and I can authorize the phone from my computer. No password entry required on the device. This does require two devices, but most of the time I find I could do this for first logins on my phone.
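To make the flow above concrete, here is a minimal in-memory sketch. Everything in it (the `DeviceTable` class, its method names, the token sizes) is hypothetical and only illustrates the idea; a real site would also need notifications, expiry, and persistent storage.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class DeviceTable:
    """Hypothetical device table: pending requests wait for approval."""
    pending: dict = field(default_factory=dict)     # request_id -> (username, device_name)
    authorized: dict = field(default_factory=dict)  # request_id -> session token

    def request_access(self, username: str, device_name: str) -> str:
        """Called by the new device: only a username is entered, no password."""
        request_id = secrets.token_urlsafe(16)
        self.pending[request_id] = (username, device_name)
        return request_id

    def approve(self, request_id: str, approving_user: str) -> None:
        """Called from a session that is already logged in as approving_user."""
        username, _device = self.pending[request_id]
        if approving_user != username:
            raise PermissionError("only the account owner can approve a device")
        del self.pending[request_id]
        self.authorized[request_id] = secrets.token_urlsafe(32)

    def session_for(self, request_id: str):
        """Polled by the new device; returns None until the request is approved."""
        return self.authorized.get(request_id)


table = DeviceTable()
request_id = table.request_access("alice", "phone")
print(table.session_for(request_id) is None)   # not approved yet
table.approve(request_id, approving_user="alice")
print(table.session_for(request_id) is not None)
```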



Provocative counter-question: what if, instead of accepting plain text passwords, a site accepted only a (hex/base64/??? encoded) cryptographic hash of the password?

On the definitely-a-plus side:

  • Entropy of the password is limited only by the cryptographic hash’s characteristics, primarily length. So up to 512 bits of ‘carrying capacity’ is easily available, though users’ passwords are likely to be significantly less ‘random’ than this.
  • Users can type/paste in whatever they like.
  • You get a nice and formally specified input, so you can do basic validation/sanity checks trivially (i.e. is the length correct, is the encoding correct).
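As a rough sketch of the validation point, assuming the client submits a lowercase hex-encoded SHA-512 digest (the choice of hash and encoding, and the function names, are assumptions for illustration only):

```python
import hashlib
import re

# Assumption: the agreed wire format is 128 lowercase hex characters (SHA-512).
HEX_SHA512 = re.compile(r"[0-9a-f]{128}")


def client_side_digest(password: str) -> str:
    """Stand-in for what the browser would compute before submitting."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def looks_like_digest(submitted: str) -> bool:
    """Cheap server-side sanity check: correct length and correct encoding."""
    return HEX_SHA512.fullmatch(submitted) is not None


digest = client_side_digest("correct horse battery staple")
print(len(digest), looks_like_digest(digest))
print(looks_like_digest("hunter2"))
```

Note that in a scheme like this the digest itself effectively becomes the password, so the server would presumably still want to salt and slow-hash whatever it receives before storing it.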

On the depends-on-your-perspective side of things:

  • Silly password rule checks have to happen client side, as you will not be able to do so server side.
  • In the context of usability, you essentially require your users to use JavaScript in order to log in.

On the definitely-a-down side:

  • … What am I missing?


I’m probably mistaken but you may be thinking of broken support for Arabic ligatures in things like fonts? Because in Arabic script you have ligatures for pretty much everything depending on context, so the same ‘character value’ looks wildly different depending on:

  • How broken your UI controls/OS/fonts are
  • What position and what word they appear in
  • What support is simply missing in your UI controls/OS/fonts.

However, that’s probably not how people enter passwords anyway: you type a particular pattern on your keyboard but that has only an indirect (if any) relationship with how such a pattern would be rendered on the screen. So you memorise which key presses to perform, rather than what shape the output should be.



While I agree that length is a simple solution and a good starting point, I invite you to check out the zxcvbn library, as a few others have mentioned. It is written in JS, is designed to run client side in ~20ms, and has a shorter (~30k) list of passwords, names, and common words. It uses a variety of metrics to estimate how well the password resists guessing and provides a lot of feedback on what it finds. There is a demo linked from the GitHub page.

This library has been tested with leaked password lists and common cracking tools (here is a presentation and paper on that topic from USENIX Security '16) and seems to do well in predicting the strength of passwords against realistic attacks.

There are also ports to many languages, including Ruby, if you want to run it server-side.



I don’t think so. All I remember is reading something about how Apple and Windows were generating different codepoints for the same character (or maybe ligature), and in a password that would break things. Given the support headache over mistakes in passwords, I can see that if this sort of thing is even remotely a concern, it’s not worth supporting “passwords” outside of the ASCII range. You can’t remotely troubleshoot this in support, since support shouldn’t ask you for your password, you shouldn’t log the password, etc., etc. I do feel like it was a ligature or accent mark or something else like combining characters that was doing this.



Maybe I’m remembering this.

Summary (code shortened; see the post for the full code):

print "\N{U+212B} \N{U+00C5} \N{U+0041}\N{U+030A}\n";

And that will print Å Å Å
What should passwords do? If I allow Unicode in passwords, should I allow U+212B when the original password had U+00C5?
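For anyone who wants to poke at this without Perl, here is a rough Python equivalent using only the standard library. It prints the codepoints behind each visually identical string and checks what canonical normalization (NFC) does to them; the dictionary labels are just for illustration.

```python
import unicodedata

# The same three strings as in the Perl one-liner above.
candidates = {
    "angstrom sign": "\u212b",
    "precomposed A with ring": "\u00c5",
    "A + combining ring": "\u0041\u030a",
}

for label, text in candidates.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{label}: {text} -> {codepoints}")

# As raw strings they are three different values...
distinct_raw = len(set(candidates.values()))
# ...but NFC maps all of them to the single codepoint U+00C5.
nfc_forms = {unicodedata.normalize("NFC", text) for text in candidates.values()}

print(distinct_raw)    # 3
print(len(nfc_forms))  # 1
```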

Given Ovid’s question, how would you even begin to support that? A password hash wouldn’t know. Would you create a giant map of all codepoints that look alike (let’s add HTML entities too), and then either (a) hash them multiple times, or (b) convert the input before hashing, possibly reducing entropy?

Since I think it’s impossible to support this in any meaningful way, I may remain firmly in the camp that Unicode shouldn’t be allowed in passwords (unless you’re whitelisting characters and controlling the input devices, like an on-screen keyboard made just for your app and available on every device).

A commenter on that post said this:

One thing to consider is, what are your costs? Has anyone done research to find out whether allowing Unicode passwords results in an increase of customer service calls, because people are having problems? You may think passwords become more secure because of the increased key space, but what if someone picks a password with a “smart-quote” (U+2019) when creating an account using his PC (not really realizing he’s using “smart-quotes”)? Then, later, while travelling, he tries to log in to your service using a mobile device, but the keyboard on that device has regular quotes (U+0027) handy, and “smart-quotes” hidden behind layers of menus. Now, your customer gets told his password is incorrect. You may lose sales this way, or even lose the customer. Or he may keep a customer service agent occupied for 15 minutes. I’m not going to express an opinion on whether that’s a price you should be willing to pay, but it is something one has to consider.
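The smart-quote case is worth checking separately, because it is a different kind of problem from composed versus decomposed accents. A quick sketch (the sample strings are made up) suggests that none of the four standard Unicode normalization forms treats U+2019 and U+0027 as the same character:

```python
import unicodedata

typed_on_pc = "it\u2019s"      # RIGHT SINGLE QUOTATION MARK ("smart-quote")
typed_on_phone = "it\u0027s"   # plain APOSTROPHE

results = {}
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    same = (unicodedata.normalize(form, typed_on_pc)
            == unicodedata.normalize(form, typed_on_phone))
    results[form] = same
    print(form, same)
```

So even a site that normalizes passwords would still reject the mobile login in the commenter’s scenario, unless it added its own folding rule for quotes.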

That said, there are still plenty of other stupid rules in the password world… Jeff’s not completely wrong, but I think I would not advise allowing Unicode…



I’m concerned that this method is only as strong as its weakest link, which is the passwords the users pick. And if they are weak, then the whole system is weak. I don’t think the “gate key” password makes the system any more secure, just unnecessarily complex. While the gate key is one more step, it’s not a very secure step: the gate key is a password known by everybody, stored in a cookie which can be read or intercepted, and (e)mailed out to all users, so it exists at least in email boxes. The gate key is relatively easy to discover, and once you have it you can attack all user accounts, some or many of which will have poor passwords.

I don’t think this system is any more secure than the typical username/password method, and it would be simpler and more secure to drop the gate key and instead just require more complex passwords from users.

There’s a reason that solutions like this aren’t at all common: the roll-your-own approach tends not to work as well as the tried-and-true methods.



Perhaps, but I’ve gone 15 years without a problem, so that might count for something. :)

The reality that I have to deal with truly isn’t hackers coming in and attempting to steal my client’s inventory secrets, but the fact that employees all tend to know each other’s passwords, and when one gets fired, no matter how insanely complex the requirements are, they would be able to get in to any account they knew the password to.

In the end, it is often a known person who causes the most damage. Whether it is the $10/hr temp security guard who lets anyone in, or the intern who brags to his hacker friend about how inadequate a particular part of the application might be, or the salesperson who steals the client list on his way out the door, or the webmaster who gets fired and later uses a back door to spoof emails and wire money abroad.

I’ve had all of these happen to clients over the years, but I’ve only once had a successful attack, and that was a SQL injection on a PHP system we inherited, over a decade ago. (Still should have seen it.)

Anyway, I suppose it’s best to “know your enemy”. Passwords are a pain in the ass. My best recommendation is simply to be slow about confirming and only allow a few tries. Oh, and require at least 7 characters. …Then again, my Yahoo password is still 4 digits. (Got it in 1999.) I figure nobody is going to try that!!
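The “only allow a few tries” part could look something like the sketch below. The `LoginThrottle` class, the limit of three attempts, and the five-minute lockout are all made-up illustration values, not anything from a real system.

```python
import time


class LoginThrottle:
    """Hypothetical per-user attempt limiter with a timed lockout."""

    def __init__(self, max_attempts=3, lockout_seconds=300.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.lockout_seconds = lockout_seconds
        self.clock = clock
        self._failures = {}  # username -> (failure count, time of last failure)

    def is_locked(self, username: str) -> bool:
        count, last_failure = self._failures.get(username, (0, 0.0))
        if count < self.max_attempts:
            return False
        if self.clock() - last_failure >= self.lockout_seconds:
            del self._failures[username]  # lockout has expired, start fresh
            return False
        return True

    def record_failure(self, username: str) -> None:
        count, _ = self._failures.get(username, (0, 0.0))
        self._failures[username] = (count + 1, self.clock())

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)


throttle = LoginThrottle(max_attempts=3, lockout_seconds=300.0)
for _ in range(3):
    throttle.record_failure("alice")
print(throttle.is_locked("alice"))  # True
print(throttle.is_locked("bob"))    # False
```

The “be slow about confirming” part is usually handled separately, by using a deliberately slow password hash or adding a delay before responding.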




The question of whether or not two code points are the same according to collation rules is not very relevant to typing/pasting in passwords. The user does not (need to) know what other permutations exist: they just type in what they typed in before and it works. More precisely: in so far as the issue exists it is no different from users having to struggle with different keyboard layouts/layout conventions between different devices/OSes. A silly thing which makes life needlessly complicated, sure, but also not something for software to try and second-guess. Better to be consistently silly (predictable behaviour for the user) than to be inconsistent in a silly fashion (unpredictably and frustratingly inconsistent when least expected).

Collation is relevant for things such as sorting, comparison and case shifting; none of which are operations you should ever even contemplate doing with passwords.

Put another way: worrying about supporting collation in passwords is seeking out complexity where it’s not needed.



Unless typing it in on a different operating system generates a different code point, which I haven’t seen. You may be assuming that Mac, Windows, and Linux all generate the same codepoint. I think they don’t, but I haven’t found a citation for that.



This isn’t necessarily better. It can be a drawback when you have some kind of account that spans multiple domains or changing domains; then you have to remember which domain name you used to generate the password. E.g. my credentials at work are used for my NT login, my account on our private code repository, my JIRA account, my online payroll account, our third-party federated login used to access a bunch of different resources, and several other sites, all with different domain names. They are all integrated with Active Directory, so there is no way to separate them.

On certain kinds of sites like bank and credit card websites, you can get some pretty idiosyncratic uses of subdomains that don’t even necessarily remain static over the longer term. Of course you might try to have a policy of only ever taking the second-level domain to avoid some of those subdomain issues, which might seem like it works until you use a site that uses sub-domains that should not share a password, then you have to remember for which sites you have to ignore the subdomain and which sites you should include it.
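To illustrate the subdomain problem, here is a toy version of a domain-based generator. It assumes the scheme being discussed derives each site password from a master secret plus the domain name; the `site_password` function and the use of HMAC-SHA-256 are my own stand-ins, not any particular tool’s algorithm.

```python
import base64
import hashlib
import hmac


def site_password(master: str, domain: str, length: int = 20) -> str:
    """Hypothetical derivation: HMAC the lowercased domain with the master secret."""
    digest = hmac.new(
        master.encode("utf-8"),
        domain.lower().encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


master = "a long master passphrase"

# The same account reached through a different subdomain yields a different password.
print(site_password(master, "bank.example") == site_password(master, "login.bank.example"))  # False
# Case differences are folded away by the lower() call.
print(site_password(master, "bank.example") == site_password(master, "BANK.example"))  # True
```

Any change to the exact string fed in as the domain silently changes the output, which is why you end up having to remember which form of the domain you used for which site.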



The most annoying thing is that whoever sets password rules for each site seems to think only their site exists (instead of being part of, y’know, the internet) and that each user will take the trouble to create a unique and completely distinct password for their site, even though the reality is that people might have something like 50 sites they need passwords to and most human brains can’t remember that many password/site pairings.



Well… tell that to Microsoft… Here is a screenshot from the Azure reset password page.



I was previously using a long, complicated password, but using the same password everywhere…
I got hacked on a few services because of this (probably because some services with bad practices got hacked).

Now I just use LastPass to generate random 32-character passwords (with special characters, numbers…).
And for LastPass I have an 18+ character password that tells, in my native language, a story from my life (first characters of the words, numbers, special characters…), plus 2FA (my only worry with 2FA is that the phone can be stolen or I could lose it).

One problem now is that some sites don’t like special characters or long passwords, and sometimes I have to limit the generator to 16 characters.

All that said, I prefer it when websites offer me the possibility to log in with Google; I have a good password there and 2FA.



Those rules don’t inspire confidence — I think if you see a detailed set of rules that might be an artefact of:

  • Someone in a basement just hacked together his own password hashing function. That would be my №1 suspect with length restrictions

  • Some characters cause their application to seize up because they don’t know how to escape properly / use prepared statements / or whatever. I’ve seen a couple of applications crash with internal errors when you use certain characters in passwords.
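For the second suspect, a small sketch of why prepared (parameterized) statements make character restrictions unnecessary. The table and column names are invented, and the raw string is stored directly only to keep the example short; a real application would store a password hash instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, secret TEXT)")

# A value full of characters that break naive string-concatenated SQL.
awkward = "pa'ss\"; DROP TABLE users; --"

# The ? placeholders pass the value as data, so no escaping is needed.
conn.execute("INSERT INTO users (name, secret) VALUES (?, ?)", ("alice", awkward))

row = conn.execute("SELECT secret FROM users WHERE name = ?", ("alice",)).fetchone()
round_tripped = row[0] == awkward
print(round_tripped)  # True
```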



they just type in what they typed in before and it works.

Um, no, that’s exactly what people are pointing out. Read it. If you type á you may get [U+00E1] or you may get [U+0061, U+0301]. Or consider the difference between μ and µ (that’s U+03BC and U+00B5 respectively). You can’t tell that difference on a keyboard. The AZERTY layout includes µ; want to guess which one that is?

Comparing strings for equality is collation.

Maybe it will work if passwords are always normalized to a certain normalisation form before hashing. Whether to use Compatibility or Canonical form will be a long discussion.
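A sketch of what normalize-before-hashing might look like, using the examples from this post. The `hash_password` helper, the fixed demo salt, and the PBKDF2 parameters are illustrative assumptions only. It also shows why the Canonical versus Compatibility choice matters: NFC unifies the two spellings of á but not μ versus µ, while NFKC unifies both.

```python
import hashlib
import unicodedata


def hash_password(password: str, salt: bytes, form: str = "NFC") -> bytes:
    """Normalize to the given Unicode form, then hash the UTF-8 bytes."""
    normalized = unicodedata.normalize(form, password)
    return hashlib.pbkdf2_hmac("sha256", normalized.encode("utf-8"), salt, 100_000)


salt = b"fixed-demo-salt"  # demo only; a real system would use a random per-user salt

precomposed, decomposed = "\u00e1", "\u0061\u0301"  # two spellings of á
greek_mu, micro_sign = "\u03bc", "\u00b5"           # μ versus µ

# Canonical normalization makes both spellings of á hash identically.
print(hash_password(precomposed, salt) == hash_password(decomposed, salt))  # True
# It leaves the micro sign alone, so μ and µ still hash differently.
print(hash_password(greek_mu, salt) == hash_password(micro_sign, salt))     # False
# Compatibility normalization folds the micro sign into Greek mu.
print(hash_password(greek_mu, salt, "NFKC") == hash_password(micro_sign, salt, "NFKC"))  # True
```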




Actually, this is hard for non-technical users, who end up storming into your IT department every day.
I wish someone had already invented a password-recovery algorithm based on behavior: if you forget your password, you press a button and tell the voice recognition “I don’t know”, and it asks you a security question: “Where did you hide it?” ;)



It definitely is better than a 4-letter password, not arguing that, but it’s still predictable enough. I think the standard “dictionary” used by most people would be much less than 50k words, probably more like 5k (or even less). If everybody uses passwords like this, all hackers need are pre-calculated tables, and they can “crack” hundreds of passwords in seconds.
OTOH, if you add unpredictable and unique “entropy”, like replacing all the “a”s with, let’s say, “p”s, your password gets much more secure, as it’s unique and no pre-calculation helps with cracking it.
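To put rough numbers on the dictionary-size point: if words are picked uniformly at random, each word contributes log2(dictionary size) bits. The four-word passphrase length below is an assumption for illustration, and human-chosen words will generally do worse than this estimate.

```python
import math


def passphrase_bits(dictionary_size: int, words: int) -> float:
    """Entropy in bits of `words` words drawn uniformly from the dictionary."""
    return words * math.log2(dictionary_size)


for size in (5_000, 50_000):
    print(size, round(passphrase_bits(size, 4), 1))
# 5000 49.2
# 50000 62.4
```

So shrinking the effective dictionary from 50k to 5k words costs about 13 bits on a four-word passphrase, which makes guessing roughly 10,000 times easier.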



Where did you get the 100k common password list from? Are such things commonly available? Is there a common API for such things?