The Pure Crypto Project
Remarks on Security

Attacks on the Cipher RSA

With the usual notation PCP relies on a cryptographic function (ModExp) that transforms a message M in the following way:

{{ M } Ka } Ka-1 = M

where Ka is Alice's public key and Ka-1 is her corresponding secret key. This scheme can elegantly be used to create a signature by applying the keys in the reverse order.

Key Length

One of the obvious problems follows from the fact that keys must have a sufficient length, which means effectively that it is computationally infeasible for an opponent to deduce the secret key from the public key, i.e to factor the modulus (or use a shortcut unknown to the crypto crowd). The history of factoring shows that it is possible to factor moduli of 512 bit and recent rumours related to Bernstein's work and attempts to design fast factoring hardware nourish the expectation of further improvements, so that there has to be a limit of modulus length which every key must meet to be used in PCP. This limit currently is set to 1024 bit. PCP will reject every key which is shorter. Of course there are different keys for encryption and signing, so the long-term signing key can be considerably longer than the current encryption key.

Multiplicative Homomorphism

A bunch of other problems arise from the so called "multiplicative homomorphism" which is a property of the core-encryption function ModExp. If you create a message M3 by multiplying M1 and M2 - very few will make sense, but if you are lucky - you can create ciphertext C3 by multiplying C1 and C2. You may think that this is not a problem, because creating ciphertext would only require the use of public keys and thus would be possible for everyone for any plaintext. But as the same property holds for signatures an opponent can create new signatures that look as if having been created by other persons by simply multiplying two (or more) old signatures he has got. This can be prevented by hashing the message before signing because multiplication of hash values would produce a different value than the multiplication of the plaintext messages before hashing (given that the round function of the hash is multiplication free).

That arises the question how ModExp is used in PCP to create a signature. usually this can be answered by giving the protocol for signature creation and verification.

Preconditions
A meets CA : A , Ka = (n1, e1, A, hash(n1+e1+A))
CA meets A : CA, KCA = (n0, e0, CA, hash(n0+e0+CA))
B meets CA : B , Kb = (n2, e2, B, hash(n2+e2+B))
CA meets B : CA, KCA = (n0, e0, CA, hash(n0+e0+CA))
CA verifies the security hash on Ka and Kb
CA ----> A,B :  Ka, CA, { T1, CA, hash(Ka) } KCA-1
CA ----> B,A :  Kb, CA, { T2, CA, hash(Kb) } KCA-1
A and B verify the CA's signature and sign the keys themselves.

Signing by Alice A ----> B : Signature = ( M, A, { T3, A, hash(M) } Ka-1 )

Verification by Bob Signature is good, if hash(M) corresponds with { { T3, A, hash(M) } Ka-1 } Ka = T3, A, hash(M)

In case Alice and Bob cannot meet directly to exchange their public keys Ka and Kb they usually rely on the service of a third person ( the CA ) in which both of them have a certain degree of trust in, that this third person will check the key holder's identity as well as her ability to use the corresponding secret key and so on. The keys which are presented to the CA consist of the modulus, the public encryption value and the user's identification string (only this is a valid key) together with the hash of these three values. Note, that changing the user's identification will show up in the security hash! Thus the security hash protects the key material and simultaneously the user's name from manipulations even if the key is unsigned. After the CA has verified the security hash of Ka and Kb with Alice's and Bob's written details it sends both keys back to their owners and publishes them in a public directory.

Alice creates a signature on her message M by hashing the message and encrypting a timestamp T3, her name A and the hash of the message with her secret key sending this piece of crypto together with her name and the original message to Bob.

Before Bob can verify Alice's signature, he has to verify the CA's signature to make sure that he can use Alice's key in PCP which is only possible after he has signed this key with his own signing key. After that he decrypts the piece of crypto inside Alice's signature with her trusted public key Ka which reveals the hash value Alice has created and compares it to his own hash of the message M in the signature. If they don't match, either the message M is altered or replaced by an opponent or the crypted piece of information is replaced by a different one which had been created with Alice's secret key long ago, but using a different message. Or the signature is completely forged which is signalled in PCP as a bad signature.

The multiplicative homomorphism of RSA leads to another problem which comes into play when you encrypt messages. As I understand a method called Optimal Asymmetric Encryption Padding (OAEP) is used to protect encrypted plaintext from being forged and this usually applies only to short pieces of data like key packets. Extending OAEP to all encrypted plaintext would be roughly equivalent to signing the message, because a hash of the whole message body is needed in OAEP, so the additional step of encrypting the hash would not be expensive. But of course there are other reasons to introduce an Asymmetric Encryption Padding Scheme because without a carefully crafted manipulation of the plaintext the RSA encrypted blocks are exposed to a variety of attacks, which will be described before PCP's padding scheme will be analysed.

Key related Problems

Common Modulus Attack

As PCP stores public keys used for encryption in the directory "trusted-keys" two different keys may have the same modulus by chance. This would make encrypting a message using either key insecure, because each of the different key owners can factor the common modulus using her own secret key and thus is able to deduce the other person's secret key from her public key effectively. Checking the keys used by PCP for common moduli is a must every time a new key enters the database.

Key spoofing Attacks

Some carefully crafted key spoofing attacks have been described in detail by Ross Anderson and Roger Needham (see [AN-95a], [AN-95b]). Especially when users try to sign something that contains already encrypted material an opponent who is able to factor the modulus used for encryption (i.e. at least Bob) can work out a different Message M' and publish a modified public key for Bob so that the new Message M' looks like a valid signature made by Alice on something that was encrypted for Bob. (see [AN-95a] p.2 and [Bon] p.4 for details)

To prevent this kind of attack PCP checks every message the user is about to sign whether it contains lines of cryptographic material - which is everything made out of 0..9 on a line by itself - and issues an unequivocal warning that the user should not proceed with the signing process unless he knows what he is doing. The warning makes clear that it always is prudent first to sign before encrypting something.

Another form of key spoofing attacks require compatible weak keys (see [And-2000] p.4 for details) that will work if an attacker is able to modify the modulus in a public key published in an unprotected place. This will show up in PCP because before any key is used the security hash is printed, so that the vigiliant user will notice that the hash has changed. If the manipulation has occured before the user signs the key and the correct unmanipulated security hash is on the key and has been checked with the user (or a CA) then PCP will notice that the signed key material gives a different security hash than the one which is signed. This will lead to another unequivocal warning not to use the key before it has been checked with first-hand knowledge. But if the security hash is replaced as well - a daring attack - everything looks fine, except that the length of the modulus is considerably longer than the original and this will be calculated and displayed to the user as well before the key is used. So if you know that your mate uses a 2048 bit signing key and the length of the public key shows more when it is used to verify, you know that something is rotten with this key.

Low Public Encryption Exponent Attack

There is a serious problem arising from the fact that most public keys commonly used have a very small public encryption exponent e as part of the public key (e,N) which makes attacks based on Coppersmith's theorem become practical. Basically his theorem provides the means to find small roots of polynomials modulus a composite N. An attack will require, that the same message M is encrypted with a number of different public keys. If for instance the public encryption exponent is 17, which is a number used in very many public keys today, an attacker can obtain the plaintext after he has collected more than 17 different ciphertexts of the same message and has used the method to compute 17th roots based on Coppersmith's theorem effectively. This attack can be defeated by carefully padding the message before it is encrypted with the RSA primitive but only, if some randomness is introcuced in the process of encryption. In what way this will work inside PCP will be explained later.

Random Fault Attacks

If signature verification fails PCP will report this fact to the user without revealing details of the mismatch between the freshly made hash of the message and the decrypted material in the signature. On the other hand, if a signature is created and the secret key is being used for encryption some randomly occuring dysfunction may affect the created signature so that at least one single bit may be wrong. To prevent any possible exploitation of those random faults (see [Bon] p.13) PCP will always check, if a signature verifies correctly before it is sent out to the recipient.

Oracle Attacks

Recently oracle attacks have become very popular (see [JKS] and [Bon] p.3) and they pose a real threat to the RSA encryption scheme, when being used without proper padding of the plaintext. Generally speaking, if any process of decryption involving the secret key leads to incorrect plaintext, i.e. if the decryption fails, the decrypted material must be discarded reliably. There should never be a way to access this faulty crypto, not even by the user himself, because if the opponent can learn whether a decryption was correct or not he can feed the user with crafted ciphertext and abuse him as an oracle. This may put him into a position to recover an encrypted message once he has got the faulty decrypted material. Therefore PCP will never return any output of an unsuccessful use of the secret key and the time for unsuccessful decryption will be exactly the same as for successful decryption to prevent timing attacks.

Oracle attacks will work in a way similar to blinding attacks on signatures (see [Bon] p.4 and 14) An attacker uses a random number r and computes C'=C*r mod N and sends C' for decryption to the victim. If the attacker knows that the plaintext will have a certain known structure, he can deduce information about the plaintext from the fact whether or not the victim accepts the message as a valid plaintext.

Hash-Chain Encryption Padding

Due to a number of possible attacks we cannot use RSA on the plaintext blocks directly and have to provide a padding method that introduces randomness in addition to every plaintext block before encryption is done. It is very much desirable that the padding method will provide a means to check, whether a block can be decrypted successfully or not. The method used for padding relies on the secure transmission of an initial number of 256 bit which will subsequently be used to generate a hash-chain that provides a piece of information to blind the current message block. Additionally the hash-chain gives a challenge, that links consecutive message blocks so that chosen ciphertext attacks can be detected.
  1. Use standard OAEP to transfer a single Nonce of 256 bit in the first crypto packet to the recipient.
    The recipient decodes the first block and learns the initial Nonce.

  2. Prepare a hash-chain for the next block
    PAD1 = initial Nonce
    PAD2 = hash(PAD1)
    CHALLENGE = PAD1 XOR PAD2

  3. Create a long number to be encrypted with the RSA primitive as follows:
    Concatenate two numbers, the 256 bit CALLENGE and the padded message with PAD1 extended to the whole length of the message.
    i-th Block : Message XOR (repeated PAD1) || CHALLENGE

  4. Prepare the hash-chain for Block i+1:
    PAD1 = PAD2
    PAD2 = hash(PAD2)
    CHALLENGE = PAD1 XOR PAD2

  5. Continue at 3) until all plaintext is processed.
The recipient can form the same hash-chain during decryption and can recover the message with PAD1. And she can check, if the current CHALLENGE is the XOR of both pads at stage i. If not, an oracle attack is imminent and anyway integrity is violated. In this case PCP now continues to decrypt (to prevent timing attacks) and throws away the decrypted material leaving a warning in the output file, that an oracle attack may have been launched and the user will never see the decrypted material.

The attacker will only be able to replace a crypted block by another one, if she can manage to keep the lower 256 bits inside the encrypted packet the same. Otherwise the hash-chain is broken and the decrypted material will disappear forever. So she has to fix the lower 256 bits (the CHALLENGE) of the new forged plaintext encrypted under Bob's public key. And the value of the fixed part of the plaintext is one she doesn't know unless she knows the initial Nonce in the OAEP packet. If somehow she gains knowledge of the plaintext at stage i she will know PAD-i XOR hash(PAD-i) and cannot recover the padded message. And what is more, she will not learn about PAD-i from the observation of the next packet because XORing the challenges will give only knowledge of PAD-i XOR hash(hash(PAD-i)) and so on.

Hashing within PCP

The hash function which was originally designed for use with PCP received a highly reserved attention and the opinion was expressed, that nothing new should be used but the standard hash functions which are supposed to have been subject to intense scrutiny and review. On the one hand this is surely an advantage of standard algorithms like SHA-1 but on the other hand, the underlying idea and the resulting code of SHA-1 is not at all self-evident nor can it be regarded as pure crypto because of the complexity of its operations and its impenetrability beeing caused.

Fortunately the discussion brought a discrete logarithm hash functon (SDLH) to light which was proposed years ago by Adi Shamir, and which is not only based on an intuitive simple idea but also comes with a prove of collision-resistance given by Ronald L. Rivest in a posting to the cryptography discussion forum.

Consequently PCP was redesigned to run in two different modes. The "conservative mode", being the default mode, which addressed the concerns of those who will only trust standard algorithms uses SHA-1 as the hash algorithm under all circumstances. Clearly this is a compromise solution because at this point the program relies on a precompiled module implementing SHA-1 breaking with the idea of pure crypto which should entirely be understandable by the user.

The second and "pure mode" will use Shamir's discrete logarithm hash function which will be used with moduli longer than 1024 bit, so that the hash values used in signatures will be that long as well.

Shamir's discrete logarithm hash function (SDLH)

The SDLH is base on a simple idea that once the message is converted into a long integer a hash of the message can be computed as follows:
hash(x) = g x (mod p*q)
given, that both p and q are large primes which are being kept secret so that factoring n = p*q is computationally infeasible.

This hash function is provably collision-resistant, I quote the prove Ronald L. Rivest presented in his posting:

Adi Shamir once proposed the following hash function: Let n = p*q be the product of two large primes, such that factoring n is believed to be infeasible. Let g be an element of maximum order in Z_n^* (i.e. an element of order lambda(n) = lcm(p-1,q-1)). Assume that n and g are fixed and public; p and q are secret. Let x be an input to be hashed, interpreted as a non-negative integer. (Of arbitrary length; this may be considerably larger than n.) Define hash(x) = g^x (mod n). Then this hash function is provably collision-resistant, since the ability to find a collision means that you have an x and an x' such that hash(x) = hash(x') which implies that x - x' = k * lambda(n) for some k. That is a collision implies that you can find a multiple of lambda(n). Being able to find a multiple of lambda(n) means that you can factor n. I would suggest this meets the specs of your query above. Cheers, Ron Rivest Ronald L. Rivest Room 324, 200 Technology Square, Cambridge MA 02139 Tel 617-253-5880, Fax 617-258-9738, Email <rivest@mit.edu>
There are a number of issues to be addressed, especially when the SDLH is being used together with the RSA signature scheme and a full analysis of the SDLH's security can be found in the paper "A Discrete Logarithm Hash Function for RSA Signatures". The analysis shows, that SDLH can safely be used together with RSA once certain conditions are met with regard to the selection of the user's key material. For details I like to refer to the paper about SDLH.

The Protection of Your Secret Keys

The intention to keep the source code of PCP as small as possible is not the only reason why there is no symmetric cipher in PCP. At least when it comes to protecting your secret keys with a passhrase some symmetric cipher is needed. So why shouldn't we use this method instead of the painfully slow encryption mechanism with RSA keys? The answer is simple, if you can generate a truly random sequence of data which depends on your passphrase alone you can use this sequence as a one-time-pad to protect your secrete key, because this one-time-pad never has to leave your local system. This method will be secure as long as it is computationally infeasible to recover the sequence of random data used to protect your secret key.

But the same method which is suitable to protect your secret key cannot be used to encrypt a message that is meant to be sent to another person for practical reasons which are generally known and accepted. So the intuitive and secure OTP-method can be used locally only without the need to transfer the OTP to another party securely.

Let me first describe in detail the secret decryption value d is protected on your local computer system before I will discuss the reasons against the plan to engage this method as a general symmetric substitution for the RSA-encryption of plaintext.

The user's passphrase string is read in from the console and then hashed into a 256 bit number (using the secret key's modulus), the result of 78 decimal digits is then split into 13 pieces consisting of 6 decimal digits each, providing 13 pointers into a file which consists of at least 1100000 bytes of truly random data. Each pointer provides a different sequence of random data with the length of the modulus. All 13 random sequences are then XORed and the resulting sequence is finally XORecd with the secure decryption value d and stored in the file that holds the secret key. When the secret key has to be used for signing or decryption the passphrase is read in again and the 13 piece one-time-pad is reconstructed to unlock the stored encryption value.

The sequence of random data is clearly used as a one-time-pad because it will never be reused to encrypt anything else but the secret key and if the hash function does not cluster, every possible pointer is equally likely, depending only on the passphrase used, which must contain as much entropy as the pointers. To recover the pad one has to recover all 13 pointers used, i.e. one has to guess the passphrase, if the hash is a one-way function. A brute force attack on the secret key encryption would roughly require to check all possible combinations of 13 sequences out of a million, which is roughly proportional to enumerating all 256 bit hash values. Usually it seems much more rewarding to make assumptions on the passphrase and to try dictionary attacks instead of mounting a fully fledged brute force attack, hoping that the user has a weak passphrase. But if the passphrase has enough entropy the strength of this encryption scheme is almost 256 bit, provided that the entropy file is made out of truly random data, which is not an easy job to do.

Ferguson, Schneier and Wagner (see [ FSW ]) published their cryptoanalytic research on a randomized stream cipher implemented by TriStrata which was designed as a symmetric encryption algorithm also using a 1 MByte random data pool but only 5 instead of 13 pointers to generate a keystream for encryption. They analysed this smaller variation of an encryption scheme first introduced by Maurer ([Mau-92]) which promised to be both provably secure and at the same time practical using a very huge amount of publicly available random data with 50 pointers. In their analysis Ferguson et. al. showed that serveral attacks against the smaller Maurer-like variant of the cipher are possible with a complexity of 293 for an exhaustive search attack on the pointer space and even 257 when a meet-in-the-middle attack is being mounted. They also showed that improved attacks with a complexity of only 239 are possible if one could induce faults into the random pool or if one could find related keys whose contributions to the keystream cancel out. They concluded that if the number of pointers used is increased to 14 the complexity of an attack would rise to 2128 but with only 5 pointers the TriStrata system will not always be secure.

All attacks against the TriStrata variant are based on two assumptions which are realistic for a symmetric cipher. Firstly, that the random pool is publicly known, a fact we can suppose for PCP's entropy file as well, even if it is protected by the OS's protection and ACL mechanisms and secondly, that the stream of encrypted data can be observed freely by the attacker who knows some bytes of the plaintext because of some known structures of certain plaintexts like WORD-documents and the like. Ferguson et. al. suppose that the adversary knows at least 16 bytes of plaintext revealing 16 bytes of the keystream. But this assumption of known plaintext is unrealistic in the context of PCP's protection of the secret key as the secret key does not only lack a known structure but also even if a fraction of the secret key is known there are different ways to reconstruct the rest of the secret decryption value (see [Bon] p 11) and the protection is already broken. Thus the special purpose the Maurer-like encryption scheme is used for in PCP to protect the secret key will render the known-plaintext attacks described by Ferguson et. al. useless.

But at the same time their cryptanalysis shows that such an encryption scheme cannot safely be used as a general substitution for RSA encryption inside PCP to speed up its performance. As long as the random pool and the number of pointers are not increased considerably randomized stream ciphers a la Maurer cannot be regarded as a secure cipher suitable for the exchange of strongly encrypted messages.

But anyway there is one aspect of Ferguson's analysis which may affect even PCP's protection of the secret key. If the attacker is able to tamper with the stored pool and the adversary can observe enough ciphertext encrypted with the manipulated random pool she might locate the position of the pointers (see [FSW] section 5.2). But in the context of PCP's protection mechanism the attacker will not be able to observe the faulty ciphertext, because PCP checks if the secret decryption value works correctly with the public key, and if not, it will display a warning that the secret key is unavailable and will simply stop working. On the other hand as long as the attacker constantly keeps changing the random pool and at the same time observes that Alice is happily signing her files she knows that the manipulated bytes of the pool are not inside the 13 sections which are used to create the one-time-pad. But if she hits a significant byte Alice will notice that her secret key does no longer work and if she is careful, she will create a different entropy file to protect her secret key taken from the safe again bringing the attacker back to square one.

Finally I would like to stress that the quality of the entropy file is of utmost importance for the security of the secret key so that it is prudent to spend as much care as possible to select the 1100000 Bytes of the random pool using a truly random source of entropy.

References

[ AN-95a ]
Ross Anderson and Roger Needham: Robustness principles for public key protocols.
in: Advances in Cryptology - CRYPTO '95, Springer LNCS v 1070 pp10-18

[ AN-95b ]
Ross Anderson and Roger Needham: Programming Satan's Computer
in: Computer Science Today - Recent Trends and Developments , J. van Leeuven (ed.), Springer LNCS vol 1000, pp 426-440

[ And-2000 ]
Ross Anderson : Two Remarks on Public Key Cryptography

[ And-93 ]
Ross Anderson : The Classification of Hash Functions
in: Codes and Ciphers (proceedings of fourth IMA Conference on Cryptography and Coding, December 1093), published by IMA (1995) pp 83-93

[ Bon ]
Boneh, Dan: Twenty Years of Attacks on the RSA Cryptosystem

[ FSW ]
Niels Ferguson, Bruce Schneier and David Wagner : Security Weaknesses in Maurer-like Randomized Stream Ciphers

[ JKS ]
Kahil Jallad, Jonathan Katz, Bruce Schneier : Implementation of Chosen-Ciphertext-Attacks against PGP and GnuPG

[ Mau-92 ]
Ueli M. Maurer : Conditionally-Perfect Secrecy and a Provably-Secure Randomized Cipher
Journal of Cryptology, vol 5 no. 1, 1992 , pp. 53-66

[ S-03 ]
Ralf Senderek : A Discrete Logarithm Hash Function for RSA Signatures, 2003