From: reinhold@world.std.com (Arnold G Reinhold)
Newsgroups: sci.crypt.research
Subject: Results of PGP Pass Phrase Survey (Repost)
Date: 3 Jun 1995 02:44:17 GMT
Organization: The World Public Access UNIX, Brookline, MA
Message-ID: <3qoia1$e8s@net.auckland.ac.nz>
Reply-To: reinhold@world.std.com (Arnold G Reinhold)
-----BEGIN PGP SIGNED MESSAGE-----
Arnold G. Reinhold
Cambridge, Massachusetts, USA
May 19, 1995
Revised June 1, 1995
Pass phrase management is arguably one of the weakest links in the PGP security chain. To gather some facts on actual pass phrase usage, I recently conducted a survey over the Internet.
The survey questionnaire was posted to Usenet four times: three times on alt.security.pgp (on March 10, March 26, and April 13) and once on sci.crypt (on April 24). A total of 46 responses were received, the last on May 7. One respondent declined to answer the pass phrase questions and was excluded, leaving 45 responses in the tabulations below.
This is not an ideal survey in that the sample size is small and all the respondents are self-selected, but it is the best method for gathering some real data that I could come up with. Thanks to all those who took the trouble to respond.
| Platform | Responses |
|---|---|
| MS-DOS | 24 |
| Windows | 12 |
| Macintosh | 8 |
| Unix (multi-user) | 9 |
| Unix (standalone) | 6 |
| OS/2 | 4 |
| Amiga | 2 |
| Atari | 1 |
Eleven responders indicated two platforms and three indicated three platforms. Most Windows users also indicated MS-DOS.
The number of responders who indicated multi-user Unix is a cause for concern. Multi-user Unix is generally considered an undesirable platform for PGP use since it is so lacking in security. Six of the nine responders who indicated multi-user Unix indicated a personal computer platform as well. The wording of my question does not exclude the use of multi-user Unix systems for encryption only, which is less of a risk.
| Frequency of Use | Responses |
|---|---|
| Rarely | 5 |
| Once a month | 9 |
| Once a week | 6 |
| Several times a week | 14 |
| Daily | 4 |
| Several times a day | 7 |
The highest reported usage was 50 times/day.
| Key Length | Responses |
|---|---|
| 384 bits | 0 |
| 512 bits | 1 |
| 768 bits | 2 |
| 1024 bits | 33 |
| 1280 bits | 1 |
| 2048 bits | 2 |
| 1024/2048 bits | 5 |
| No response | 1 |
Almost everyone is using long keys. Only three responders were using a public key less than 900 bits.
| Pass Phrase Length (characters) | Value |
|---|---|
| Minimum | 8 |
| First Quartile | 14 |
| Median | 21 |
| Third Quartile | 41 |
| Maximum | 100 |
| Pass Phrase Length (words) | Value |
|---|---|
| Minimum | 0 |
| First Quartile | 2 |
| Median | 4 |
| Third Quartile | 7 |
| Maximum | 15 |
| Pass Phrase Words Found in an English Dictionary | Responses |
|---|---|
| All | 15 |
| Some | 15 |
| None | 15 |
The median fraction of words found in an English dictionary was 5 out of 8. Of the 15 who reported none, 4 indicated use of non-English dictionary words.
Average Word Length
Dividing the number of characters by the number of words gives an average pass phrase word length for each responder. The distribution of these values, in characters, was:
| Average Word Length (characters) | Value |
|---|---|
| Minimum | 3.3 |
| First Quartile | 4.5 |
| Median | 5.3 |
| Third Quartile | 6.9 |
| Maximum | 86.0 |
The median average word length for respondents whose pass phrase was composed entirely of English dictionary words was also 5.3 characters. This is shorter than the median length of words in an English dictionary [DAV] (6 based on a sample of 100), suggesting that pass phrase words are not chosen randomly.
| Pass Phrase Written Down? | Responses |
|---|---|
| Yes | 2 |
| No | 43 |
Almost no one admits to writing down their pass phrase. This is in accord with conventional wisdom, but in view of the numerous short pass phrases reported, it may be bad advice. See my article in Internet Secrets [REI].
Most respondents included some comment. See Appendix A, below. Several respondents included good suggestions for choosing a pass phrase. A few indicate some incorrect beliefs about pass phrases including:
| Transmission Method | Responses |
|---|---|
| Anonymous snail-mail | 18 |
| Anonymous, encrypted e-mail | 9 |
| Encrypted e-mail | 3 |
| Anonymous e-mail | 3 |
| Open e-mail | 12 |
The medium which responders used to forward their answers to me is of interest in itself. To eliminate any possibility that the data collected could compromise someone's password, I had asked in my first posting that responses only be sent via conventional mail with no return address, i.e. "via anonymous snail-mail." I felt that this method provided the highest level of anonymity.
Despite my request, I received about as many e-mail as snail-mail replies to the first posting. A number of the e-mail replies were anonymous. Several responders asked for my PGP public key.
My subsequent repostings included my public key but still encouraged snail-mail response by offering to contribute the price of postage on all replies received through the end of April 1995 to the Phil Zimmermann defense fund. The repostings did recommend anonymous, PGP-encrypted responses as a less secure alternative.
A pass phrase consisting of dictionary words is weaker than the same size pass phrase made up of random letters. To try to compare different responses on a single scale, I estimated the entropy of each responder's pass phrase using the following approximation:
Est_Entropy = 15*num_of_dictionary_words + 5.5*num_of_characters*(1 - num_of_dictionary_words/num_of_words)
This formula assigns 15 bits of entropy to each English dictionary word and 5.5 bits per character to non-dictionary words. It is a very crude formula, and I believe it tends to overestimate the entropy of most pass phrases, but it does allow some further analysis of the survey data. Here are some results using this formula:
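The formula above can be written as a small function. This is only a sketch of the approximation as stated; the function and parameter names are my own, and the handling of a pass phrase reported as having zero words (treating every character as non-dictionary material) is an assumption, since the formula itself is undefined in that case.

```python
def estimate_entropy_bits(num_characters, num_words, num_dictionary_words):
    """Crude pass phrase entropy estimate, in bits, per the survey formula."""
    if num_words == 0:
        # Assumption: with no words reported, count every character
        # at the non-dictionary rate of 5.5 bits.
        return 5.5 * num_characters
    non_dictionary_fraction = 1 - num_dictionary_words / num_words
    return (15 * num_dictionary_words
            + 5.5 * num_characters * non_dictionary_fraction)


# A 21-character, 4-word phrase made entirely of dictionary words:
print(estimate_entropy_bits(21, 4, 4))
# The same size phrase with no dictionary words at all:
print(estimate_entropy_bits(21, 4, 0))
```

Note how the same 21 characters score very differently depending on how many of the words are in a dictionary, which is the point of the comparison.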
| Estimated Entropy (bits) | Value |
|---|---|
| Minimum | 30 |
| First Quartile | 60 |
| Median | 75 |
| Third Quartile | 157 |
| Maximum | 473 |
Most respondents are using a pass phrase with substantially less entropy than the 128-bit IDEA session key.
| Frequency of Use | Median Estimated Bits of Entropy |
|---|---|
| Rarely | 60 |
| Once a month | 110 |
| Once a week | 75 |
| Several times a week | 104 |
| Daily | 69 |
| Several times a day | 90 |
There seems to be no correlation between frequency of PGP usage and pass phrase strength.
| PGP Key Length in Bits | Responses | Median Estimated Bits of Entropy |
|---|---|---|
| <900 | 3 | 66 |
| 1024 | 33 | 75 |
| >1100 | 8 | 87 |
Those who select stronger keys seem to choose stronger pass phrases as well.
| Response Method | Median Estimated Bits of Entropy |
|---|---|
| Snail mail | 75 |
| Anonymous, encrypted e-mail | 105 |
| Encrypted e-mail | 201 |
| Anonymous e-mail | 55 |
| Open e-mail | 65 |
No strong trend is evident. However, it appears that those responders who used PGP in answering the survey (despite my urgings) have stronger pass phrases:
| Response Type | Responses | Median Estimated Bits of Entropy |
|---|---|---|
| Response not encrypted | 33 | 75 |
| Response encrypted | 12 | 105 |
In his paper Efficient DES Key Search, Michael J. Wiener [WEI] describes a machine that could exhaustively search the 56-bit DES key space in 3.5 hours. He estimated that the machine would cost $1 million to build in 1993. This method assumes knowledge of a block of plaintext and its matching ciphertext, the so-called "known plaintext" attack.
An attacker who had someone's pass-phrase-protected secret key but lacked the pass phrase itself has all the information needed for a known plaintext attack on the pass phrase. The attack is somewhat more complex than in the DES case because the generation of possible pass phrases is harder and less certain than the enumeration of DES keys, and because the algorithm that must be executed to test each pass phrase is more complex, requiring both an MD5 and an IDEA pass.
To get a rough estimate of the cost of a pass phrase attack, let's assume that a PGP pass phrase engine that could try 2^56 pass phrases in 3.5 hours could be built for $2 million using 1995 technology. Amortizing the cost over 3 years and assuming 24-hour/day operation gives a capital cost of $76/hour. Adding in $14/hour for power consumption, operator time, and floor space gives a total cost of $90/hour, or $315 per set of 2^56 pass phrases tested.
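The arithmetic behind these figures can be checked with a few lines. This is a sketch using only the assumptions stated in the paragraph above; the function name and the 365-day year are my own choices.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year of continuous operation


def sweep_cost(machine_cost=2_000_000, amortization_years=3,
               operating_per_hour=14, hours_per_sweep=3.5):
    """Return (capital $/hour, total $/hour, $ per sweep of 2^56 trials)."""
    capital_per_hour = machine_cost / (amortization_years * HOURS_PER_YEAR)
    total_per_hour = capital_per_hour + operating_per_hour
    return capital_per_hour, total_per_hour, total_per_hour * hours_per_sweep


capital, total, per_sweep = sweep_cost()
print(f"capital ${capital:.2f}/hour, total ${total:.2f}/hour, "
      f"${per_sweep:.2f} per 2^56 pass phrases")
```

Carried to the cent, the capital cost comes to about $76.10/hour; rounding to whole dollars gives the $76, $90, and $315 figures used in the text.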
This cost estimate implies the following rough relationship between bits of pass phrase entropy and the level of protection afforded in terms of cost of attack:
| Bits | Cost of Attack (1995) |
|---|---|
| 56 | $315 |
| 60 | $5,000 |
| 64 | $81,000 |
| 68 | $1,290,000 |
| 72 | $21,000,000 |
| 76 | $330,000,000 |
| 80 | $5,300,000,000 |
To allow for progress in electronics, one bit of entropy should be added every 2 to 3 years.
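The table follows from doubling the $315 cost of a 2^56 sweep for each additional bit of entropy. A minimal sketch of that scaling (names are my own; the table in the text rounds the results):

```python
BASE_BITS = 56   # entropy covered by one sweep of the hypothetical engine
BASE_COST = 315  # dollars per sweep of 2^56 pass phrases, from the estimate above


def attack_cost(bits):
    """Cost in 1995 dollars to exhaust 2^bits pass phrases."""
    return BASE_COST * 2 ** (bits - BASE_BITS)


for bits in range(56, 84, 4):
    print(f"{bits} bits: ${attack_cost(bits):,.0f}")
```

Each 4-bit step multiplies the cost by 16, which is why the figures climb from hundreds of dollars to billions over only 24 bits.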
Using my rough estimate, 31% of responders had pass phrases with 60 bits of entropy or less; 20% had less than 56 bits of entropy.
This study, crude as it is, suggests that a significant minority of PGP users are using inadequate pass phrases. Before you say "Well, my pass phrase is long enough," remember that in PGP, as in all public key systems, the security of the messages you send depends on the security of the recipient's secret key, not on your own safeguards. And, in general, you have no way of knowing how careful he or she is.
Of course, an attacker who was able to purloin someone's pass-phrase-protected secret key might also be in a position to bug their keyboard or to plant a program that would capture their pass phrase. Still, it is good practice to make every layer of security as effective as practical. Judging from their comments, users with apparently weak pass phrases often thought they were adequate.
I believe there are two paths to strengthening PGP pass phrases: education and improvements to PGP itself.
It should be possible to develop a consensus on minimal standards for pass phrases. My recommendations would include, as a minimum:
In addition, I think it is time to reconsider the "never write down your pass phrase" slogan. If writing down at least a portion of one's pass phrase leads to stronger pass phrase choices, it might be a good practice to recommend.
There are a couple of ways PGP could be improved that would reduce the risk of pass phrase compromise:
The Wiener DES engine, described above, assumed one DES trial every 20 nanoseconds in each of the parallel processing nodes. Increasing the time to 0.2 seconds would make such an engine 10,000,000 times more expensive.
Also, by using large amounts of memory and as much of the power of the personal computer's microprocessor as possible, the silicon footprint of each node would be greatly increased. Each node in Wiener's DES design had 26,000 equivalent gates. An MD5/IDEA design would require several times as many, say 100,000. On the other hand, simulating the essential parts of a 486-class microprocessor and 1 megabyte of memory might require 10,000,000 equivalent gates, complicating the design of a search engine by another factor of 100.
Combined, the two effects described above could make a pass phrase attack one billion times harder, with little negative impact on PGP users.
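The two factors and their product can be worked out directly from the numbers given above. This sketch uses integer nanoseconds and gate counts to avoid floating-point noise; the variable names are my own.

```python
import math

WIENER_TRIAL_NS = 20             # one DES trial per node in Wiener's design
SLOW_HASH_NS = 200_000_000       # proposed 0.2 seconds per pass phrase trial
MD5_IDEA_NODE_GATES = 100_000    # rough gate count for an MD5/IDEA node
PC_LIKE_NODE_GATES = 10_000_000  # rough gate count to mimic a 486 plus 1 MB

time_factor = SLOW_HASH_NS // WIENER_TRIAL_NS
gate_factor = PC_LIKE_NODE_GATES // MD5_IDEA_NODE_GATES
combined_factor = time_factor * gate_factor
equivalent_bits = math.log2(combined_factor)

print(f"time factor {time_factor:,}, gate factor {gate_factor:,}")
print(f"combined {combined_factor:,}, about {equivalent_bits:.1f} bits")
```

The combined factor of one billion is just under 2^30, so the change is worth roughly 30 extra bits of pass phrase entropy.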
Currently (as I read the source code) PGP uses the MD5 hash of the pass phrase as the IDEA key for encrypting the secret key. Instead, PGP could use the pass phrase as the initial value for a computation-time intensive hash algorithm optimized to use as much of the processing resources in a typical personal computer as possible, including wide word multiplies, branches and lots of RAM.
A good starting place for the design of a computation-time intensive hash algorithm might be the Randomizing by Shuffling method discovered by Carter Bays and S. D. Durham [BAY] and described in Knuth's The Art of Computer Programming [KNU]. The auxiliary table would be large, on the order of 1 megabyte, and filled using a 32-bit linear-congruential pseudo-random number generator. MD5 or SHA passes every so often would add security, but these hash algorithms should not be a large component of the compute time since they are optimized for hardware implementation.
In addition, each encrypted secret key should have "salt" stored with it to prevent an attacker from developing a dictionary of IDEA keys that match common pass phrases.
The computation-time intensive hash algorithm should have the number of iterations and amount of memory used as parameters. When generating secret keys or when changing pass phrases, users could be given a choice of levels of protection. Each level would specify the number of iterations and amount of memory. This would ensure interoperability between different machines.
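To make the proposal concrete, here is a toy sketch of how the pieces described above might fit together: a salt, a table filled by a 32-bit linear-congruential generator, Bays-Durham style shuffling, an occasional MD5 pass, and the iteration count and table size as parameters. Everything specific here is an assumption made for illustration: the function name, the LCG constants (a commonly published pair), the way MD5 output is folded back into the generator, and the small default sizes. It has not been analyzed and should not be used to protect real keys.

```python
import hashlib

# Illustrative 32-bit LCG constants; the proposal does not name a generator.
LCG_A = 1664525
LCG_C = 1013904223
MASK32 = 0xFFFFFFFF


def stretch_pass_phrase(pass_phrase: bytes, salt: bytes,
                        table_words: int = 1 << 12,
                        iterations: int = 1 << 14,
                        rehash_every: int = 1 << 10) -> bytes:
    """Toy key stretcher returning 16 bytes (128 bits, the IDEA key size)."""
    running = hashlib.md5(salt + pass_phrase)
    state = int.from_bytes(running.digest()[:4], "big")

    def next_lcg() -> int:
        nonlocal state
        state = (LCG_A * state + LCG_C) & MASK32
        return state

    # Fill the auxiliary table, then shuffle in the Bays-Durham manner:
    # the current output picks a slot, the slot's value becomes the next
    # output, and the slot is refilled from the generator.
    table = [next_lcg() for _ in range(table_words)]
    y = next_lcg()
    for step in range(1, iterations + 1):
        j = (y * table_words) >> 32  # 0 <= j < table_words because y < 2^32
        y = table[j]
        table[j] = next_lcg()
        if step % rehash_every == 0:
            # Occasional MD5 pass, folded back into the generator state.
            running.update(y.to_bytes(4, "big"))
            state ^= int.from_bytes(running.digest()[:4], "big")
    running.update(y.to_bytes(4, "big"))
    return running.digest()


key = stretch_pass_phrase(b"example pass phrase", b"per-key salt")
print(key.hex())
```

The defaults are kept small so the example runs quickly in an interpreter; a table of `1 << 18` 32-bit words would correspond to the 1 megabyte figure mentioned above, and the iteration count would be tuned until one trial takes a noticeable fraction of a second on the target machine.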
As PGP gains wider acceptance, new users will likely be even less careful in pass phrase selection and less willing to use long pass phrases than the "early adopters" who responded to this survey. Adding a factor of 10^9 in the difficulty of recovering pass phrases is roughly equivalent to adding 30 bits of entropy to each pass phrase and would significantly improve the security of PGP for all users.
References
[BAY] C. Bays and S. D. Durham, ACM Trans. Math. Software 2, 1976, pp. 59-64
[DAV] P. Davies, Ed., "The American Heritage Dictionary of the English Language," (55,000 entries), Dell, 1973
[KNU] D. E. Knuth, "The Art of Computer Programming," Vol. 2 "Seminumerical Algorithms," Second Edition, Sec. 3.2.2, Algorithm B, pp. 32-33, Addison-Wesley, 1973
[REI] A. G. Reinhold, "Common Sense and Cryptography," in Internet Secrets, J. R. Levine and C. Baroudi, Ed., IDG Books, 1995, p. 148
[SUK] P. K. Suk, "Re: A Good Solution to the Passphrase Question," posted to alt.security.pgp, 4 Apr 1995 23:21:06 EDT, suk@usceast.cs.scarolina.edu
[WEI] M. J. Wiener, "Efficient DES Key Search," Bell-Northern Research, Ottawa, Ont., Canada, 1993, available at http://www.eff.org/pub/EFF/Policy/Crypto/Misc/Technical/des_key_search.ps.gz
Copyright (c) 1995, Arnold G. Reinhold, Cambridge, Mass. USA. The author hereby grants rights for free non-commercial electronic distribution of the entire text with attribution and signature attached.
-----BEGIN PGP SIGNATURE-----
Version: 2.6
iQCVAwUBL84Tx2truC2sMYShAQF3tgP/RI3OQNHMu9GmCi7713DeXtzGKPeSYRRF
ti6EBsOdu8R1BdFVrW5/nBWG7HqcM0uNVl4Uy2kCAszb4Tonvsaf0qY0Dbw88EyE
EyKcfIrFZWSFHn+DlblwzxgnDiYe8owYxDuzCy4Y3kIyGlc8pFXjljMBbKLslog5
PrRHbp6OkrY=
=dYyx
-----END PGP SIGNATURE-----