Pepper#

ToDo:

  • Add illustration for the salting process - Similar to this.

  • Add reference to libraries compatible with peppers.

  • Add appendix regarding how to store environment variables in .env files.

  • Add appendix regarding reading/writing hashes to a database.

  • Add relevant resources at the end.


A pepper is an array of random bytes added to the data before hashing it. It is analogous to a salt, with some key differences:

  • Peppers are secret

  • Peppers should be stored in a different medium than salts, or not stored at all

The remaining vulnerability with salts is that an attacker can still run parallel attacks to find weak passwords: since the salt is public, they can use a powerful GPU or specialized hardware to test commonly used passwords.

With a pepper, even if the database is compromised (i.e. all hashes and salts are leaked), the attacker cannot parallelize the hash-cracking process, because testing a guess requires the salt (which was leaked), the hashed password (which was also leaked), and the pepper (which they do not have). They are therefore unable to test even weak passwords in parallel.

In other words, even for weak passwords, the pepper protects the hashes with yet another layer of security.

There are two approaches to peppers:

  • Secret Pepper: the pepper is stored somewhere, but physically separated from the salts and hashes, for example in a TPM chip, in environment variables, or in a managed service.

  • Re-discovered Pepper: the pepper is not stored, and the system has to re-discover it on every verification by trying the possible values.
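
For the secret approach, a minimal sketch of reading the pepper from an environment variable might look like the following. The variable name `PEPPER` matches the one used in the examples of this chapter; the helper name `load_pepper` and the minimum-length check are assumptions made for illustration only.

```python
import os
import secrets


def load_pepper(variable_name="PEPPER", min_bytes=16):
    """Read a hex-encoded pepper from the environment (hypothetical helper)."""
    pepper_hex = os.environ.get(variable_name)
    if pepper_hex is None:
        # Failing loudly is safer than silently hashing without a pepper.
        raise RuntimeError(f"Environment variable {variable_name!r} is not set")
    pepper = bytes.fromhex(pepper_hex)
    if len(pepper) < min_bytes:
        raise ValueError("Pepper is shorter than expected")
    return pepper


# Demo only: a real deployment would set the variable outside the program.
os.environ["PEPPER"] = secrets.token_hex(32)
pepper = load_pepper()
print(len(pepper))  # 32
```

Raising an error when the variable is missing avoids quietly falling back to salt-only hashing.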

Secret peppers are fast and have no major impact on performance, but they require an external medium to store them. Moreover, if they are leaked, the security is the same as it would have been with salts alone. Re-discovered peppers, on the other hand, can be used without any storage, but they penalize performance because of the search itself. They cannot be leaked, because the system does not know their values and a different one is used each time.

When using the secret approach, the pepper is usually quite long, whereas with the re-discovered approach the pepper is usually quite short, to minimize the performance penalty.
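
To make the length trade-off concrete, the sketch below counts worst-case hash evaluations for a re-discovered pepper of a given bit length. It assumes an attacker who knows the scheme and simply searches the pepper space as well; the function names and the dictionary size of 20 are illustrative assumptions, not measurements.

```python
def max_hashes_per_login(pepper_bits):
    """Worst-case number of hashes the system computes to verify one password."""
    return 2 ** pepper_bits


def max_hashes_for_attacker(pepper_bits, dictionary_size):
    """Worst-case hashes to test a whole dictionary against one leaked hash."""
    return dictionary_size * 2 ** pepper_bits


for bits in (8, 16, 24):
    print(bits, max_hashes_per_login(bits), max_hashes_for_attacker(bits, 20))
# 8 256 5120
# 16 65536 1310720
# 24 16777216 335544320
```

Each extra bit doubles the work on both sides, which is why the re-discovered pepper is kept short.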

An example of each is provided below.

Caveats#

Even though the algorithm used in this chapter is scrypt, it should be noted that it was not designed to handle peppers natively. Python's standard library does not offer a password-hashing function with a dedicated pepper parameter, so be mindful and only consider this code for educational purposes.
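
Because scrypt has no dedicated pepper parameter, the examples in this chapter append the pepper to the salt. Another construction sometimes used is to mix the pepper in as an HMAC key over the password before hashing. The sketch below shows that variant using only the standard library; it is an assumption for illustration, not the method used in the rest of this chapter, and the function name is hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_with_hmac_pepper(password, salt, pepper):
    """Key the password with the pepper via HMAC-SHA256, then run scrypt."""
    keyed = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    # Same low cost parameters as the chapter's examples (educational only).
    return hashlib.scrypt(keyed, salt=salt, n=64, r=8, p=1).hex()


salt = secrets.token_bytes(32)
pepper = secrets.token_bytes(32)
digest = hash_with_hmac_pepper("abc123", salt, pepper)
print(len(digest))  # 128 hex characters (scrypt's default 64-byte output)
```

In this variant the salt stays exactly as stored in the database, and the pepper only ever enters the computation as a key.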

Examples#

Common code#

import secrets
import hashlib
# https://en.wikipedia.org/wiki/List_of_the_most_common_passwords
most_common_passwords = {
    "123456", "123456789", "picture1", "password",
    "12345678", "111111", "123123", "12345", "1234567890", 
    "senha", "1234567", "qwerty", "abc123", "Million2", 
    "000000", "1234", "iloveyou", "aaron431", 
    "password1", "qqww1122"
}

def cracking_password(database_hashed):
    salt, hashed_password = database_hashed.split(":")
    salt_bytes = bytes.fromhex(salt)
    
    for guess in most_common_passwords:
        guess_bytes = guess.encode("utf-8")
        hashed_guess = hashlib.scrypt(guess_bytes, salt=salt_bytes, n=64, r=8, p=1).hex()
        if hashed_password == hashed_guess:
            return f"Password Cracked: '{guess}'"
    
    return "Password not found in database"


def password_generator_salt(password):
    password_bytes = password.encode("utf-8")
    salt = secrets.token_bytes(32)
    hashed_password = hashlib.scrypt(password_bytes, salt=salt, n=64, r=8, p=1).hex()
    return f"{salt.hex()}:{hashed_password}"


user_database = {
    "John": password_generator_salt("abc123")
}
user_database
{'John': 'bf1eb12612bea7f28f0bb003d5c3317cc8cd48830fb1055a66d637d6daa81b92:0b66356b7b98b094b440bd16d657cc983ee423de032475d780af9e7339e92f5f6ba5ba57ed713a461d76127db7bb79c430d5204a1910437586d65d5b0227b420'}

Without using Pepper#

leaked_password = user_database["John"]

cracking_password(leaked_password)
"Password Cracked: 'abc123'"

Secret Pepper Example#

# At an earlier moment in time
import os

# Environment variables are not part of the database
os.environ["PEPPER"] = secrets.token_hex(32)

def password_generator_salt_pepper_secret(password):
    password_bytes = password.encode("utf-8")
    salt = secrets.token_bytes(32)
    
    pepper = os.environ["PEPPER"]
    pepper_bytes = bytes.fromhex(pepper)

    new_salt = salt + pepper_bytes
    
    hashed_password = hashlib.scrypt(password_bytes, salt=new_salt, n=64, r=8, p=1).hex()
    return f"{salt.hex()}:{hashed_password}"   

def check_password_secret(user, password):
    password_bytes = password.encode("utf-8")
    
    salt, hashed_password = user_database[user].split(":")
    salt_bytes = bytes.fromhex(salt)
    
    pepper = os.environ["PEPPER"]
    pepper_bytes = bytes.fromhex(pepper)

    new_salt = salt_bytes + pepper_bytes
    
    hashed_trial = hashlib.scrypt(password_bytes, salt=new_salt, n=64, r=8, p=1).hex()
    if secrets.compare_digest(hashed_password, hashed_trial):
        return "Access Granted"
    
    return "Access Denied"

user_database = {
    "John": password_generator_salt_pepper_secret("abc123")
}

user_database
{'John': 'f36532aabd092dd7b7f6f3a3047ea4087e5256675ac439b9fdaa4d5ab34f4166:11e6b10c48d1a8f9f5ffe940a0e24d10ba1f08cea8777bcbb8c633dcd3a1e2b36c7ad1942abadb5406103370859dfe0f667ef495dc52c70b6bfe05d7509facbe'}
leaked_password = user_database["John"]

print(cracking_password(leaked_password))
Password not found in database
print(check_password_secret("John", "abc123"))
Access Granted
%%timeit 
check_password_secret("John", "abc123")
260 µs ± 116 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Re-discovered Pepper Example#

pepper_bitslength = 8
# Ceiling division: number of whole bytes needed to hold the pepper bits
pepper_bytelenght = (pepper_bitslength + 7) // 8

def password_generator_salt_pepper_discovered(password):
    password_bytes = password.encode("utf-8")
    salt = secrets.token_bytes(32)

    pepper = secrets.randbits(pepper_bitslength)
    pepper_bytes = int.to_bytes(pepper, pepper_bytelenght, "big")

    new_salt = salt + pepper_bytes
    
    hashed_password = hashlib.scrypt(password_bytes, salt=new_salt, n=64, r=8, p=1).hex()
    return f"{salt.hex()}:{hashed_password}"   

def check_password_rediscovered(user, password):
    password_bytes = password.encode("utf-8")
    
    salt, hashed_password = user_database[user].split(":")
    salt_bytes = bytes.fromhex(salt)
    
    for guess_pepper in range(2**pepper_bitslength):
        guess_pepper_bytes = int.to_bytes(guess_pepper, pepper_bytelenght, "big")

        new_salt = salt_bytes + guess_pepper_bytes

        hashed_trial = hashlib.scrypt(password_bytes, salt=new_salt, n=64, r=8, p=1).hex()
        if secrets.compare_digest(hashed_password, hashed_trial):
            return "Access Granted"
    
    return "Access Denied"

user_database = {
    "John": password_generator_salt_pepper_discovered("abc123")
}

user_database
{'John': '84f667cda66c0e72a60c0ceed8fc45a5cd334a6d77487b32d92f6b2b224f6d9e:70da7e70265fd709f9221228980d8ac1a02109c8520c609881c5a42de5f973446521f3c84d81ac5a85f4ce421944e679df6d77ab2e1b66d84cf9547a4c0919ae'}
leaked_password = user_database["John"]

print(cracking_password(leaked_password))
Password not found in database
print(check_password_rediscovered("John", "abc123"))
Access Granted
%%timeit
check_password_rediscovered("John", "abc123")
29.2 ms ± 12.8 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Conclusion#

Salts add a first layer of security on top of hashes, but the hash + salt strategy is still vulnerable to parallelized brute-force attacks. Introducing an additional secret sequence of bytes, called a pepper, addresses this issue. Peppers should be stored in physically separate storage or not stored at all.