Password Storage - >_ Unvalidated input

Passwords in cleartext

Storing cleartext passwords is a bad idea. Passwords should be irreversible when in storage. When passwords are stored as original cleartext, anyone who has access to the underlying storage system will also get access to all the accounts. Cryptographic hashes should be used to protect such sensitive authentication data at rest.

Hashing

The process of hashing takes the original password and transforms it using a one-way function to data of a fixed size. A few different types of hashing are out there. The most common are MD5, SHA-1, SHA-256. In recent years a number of known cryptographic weaknesses have been identified making some of these algorithms unusable as a message digest. However, not all of these apply in the context of password hashing. The main problem being these algorithms are too fast. You can address this by implementing iterative hashing, where you hash the password multiple time. You should pick the number of iterations based on time. I suggest a minimum of 200 ms for the function to complete.

Rather than building a bespoke iterative hashing, it’s a much better idea to use a hashing method that is considered a de facto standard for password storage. There are many robust algorithms out there that come with configurable work factor. To name the most popular here:

Bcrypt
PBKDF2
Scrypt
Argon2

Argon2 is the latest and was selected as the winner of the Password Hashing Competition in July 2015. This should be the default choice for any new application.

Storage format

The evolution of parallel computing enables new and faster attacks against password hashing. You should carefully choose the format used to store hashes. Cryptographic algorithms change over time. Even if an algorithm is still secure, you should increase the work factor every year to keep up with Moore’s law.

Here is an example of a format used by GNU C library:

$id$rounds=N$salt$hash

where $id is one of:

id  | Method
─────────────────────────────────────────────────────────
1   | MD5
2   | Blowfish
5   | SHA-256
6   | SHA-512

Example:

$6$rounds=6000$gFakm/qCJ77dPS.I$E5FDu2k7zTeehOC2uZ1AsUGqjO1G4Fdn8Lv0sK8iAw86gER7hPRxjDFayVBTW6inT4mlFpfaE/W7fz9jVXkqR/

The rounds=N is the number of hashing rounds actually used. The $salt stands for the up to 16 characters of random salt. The $hash part is the actual hashed password using the $id algorithm. The length of the hash depends on the algorithm used.

This is just an example and you can come up with your own scheme. However, extending the existing one with new algorithms like Argon2, PBKDF and assigning them a new $id will make things more clear. It will be also much easier to understand for anyone familiar with this storage format.

Hashes are often stored in a database. It makes sense to store other metadata about the account too:

Last password change time and date
Last login time and date
Active / inactive / locked flag

Future proofing

As the time passes you will end up in a situation where the existing password storage solution becomes cryptographically outdated and must be updated either by changing the work factor or the algorithm used. You should built-in the ability to upgrade in place without adversely affecting existing user accounts right from the start. The best approach is to upgrade it upon successful authentication. When a user attempts to log in, the application can hash the password using the old (weak) and the new (secure) algorithm. If the old hash matches the existing database records, then the new stronger hash is stored in the database replacing the old, weak one. With the additional metadata, you can measure the uptake and decide what to do with users who haven’t logged in since the change was made. An email asking users to log in back is a good idea. It makes sense to lock the account if it hasn’t been used for some time, e.g. one year.

Password reuse

Two in three people reuse the same password for multiple accounts. If a password is compromised elsewhere, it can be correlated by username or email address to other services using the same password, thus propagating the threat. It makes sense for any modern application to check the user passwords against existing data breaches. The Pwned Passwords is such a database that is available for both offline and online use. The public Have I Been Pwned API uses a k-anonymity model where the password can be hashed client-side with the SHA-1 algorithm, and only the first 5 characters of the hash are shared. Alternatively, you can download the entire Pwned Password list, upload it to a database and create a local service. Unlike the public API, the local service will require periodic updates to ensure it contains the latest breached hashes.

Go passwordless

Adopting a risk avoidance strategy for password storage may be an option for modern applications. This is only possible if you eliminate risk by withdrawing or not becoming involved in the most high-risk activity of user authentication. One such option is to implement Social Login. From the user’s perspective, it provides a frictionless way to login; it also eliminates the need to store and process passwords by the application. There are pros on cons of using social login. I’m not going to discuss it here but just to mention a few like the loss of control to a third party, privacy concerns and long-lived access tokens that may not be suitable for all applications.

Another approach is to provide a one time password (OTP) using an out-of-band channel (e.g. push notification, email, SMS) upon login. This method is as good as the security of the out-of-band channel. Also, the OTP can sometimes be delayed by hours depending on the delivery mechanism used. This may not be acceptable in some scenarios.

Finally, you can trade the password storage for cryptographic material storage. WebAuthn (Web Authentication) is a new emerging standard for authenticating users to web-based applications and services using public-key cryptography. A public key and randomly generated credential ID must be still stored. However, even if exposed the risk is minimal, public keys by designed to be openly shared. Additionally, due to a much larger keyspace, compared to average password complexity, public keys are considered infeasible to brute force at this time (keys 2048 bit and larger). All major browsers are adopting WebAuthn on both mobile and desktop. Once it becomes part of the core iOS and Android platform will mean that devices like a phone can provide us with biometric verification and WebAuthn will slowly replace traditional passwords. Watch this space closely.