PPDD
Practical Privacy Disc Driver

Introduction

This is the second version of the specification. It relates to version 0.7 and later of PPDD.

What is ppdd?

PPDD is a disc driver and support programs for Intel based systems running Linux. It allows secure encryption of all data on disc including the root filesystem and the swap file.

Threat Scenario

PPDD is designed to protect against unwanted access to data on the discs of a computer running Linux on Intel x86 based machines. The threats covered are:

- Someone steals the powered-off computer and discs.
- Someone steals or copies one or more backups.
- Someone boots the computer using a floppy boot disc and copies the data from the hard discs.

Although PPDD will prevent an unauthorised person having access to the data, it will not in any way aid the recovery of that data if the attacker destroys it or steals the computer.

No protection is offered against any other threats, particularly:

- Access to data by an unauthorised user on a multi-user system.
- Any kind of access from another computer via a network.
- Access by a "trojan horse" type program or similar.
- Physical access to the computer after access to the encrypted discs has been enabled.

For those threats covered protection rests on:

- The use of an adequate pass phrase.
- The secrecy afforded to this pass phrase.
- The following of a set procedure related to making backups.
- The security of the "Blowfish" algorithm.
- The correct implementation of the algorithm.
- The absence of compromising bugs in the programs.

PPDD also offers some degree of protection against tampering with the data it protects. This is provided by optional checksums. The protection is incomplete because it relies on having at least a kernel and checksum checking programs which themselves have not been tampered with. That in turn relies on physical protection measures outside the scope of PPDD. This applies to all integrity checking and tamper detecting schemes (e.g. Tripwire).
However, the checksum facilities make life harder for the attacker at little extra cost.

Design considerations

The design is required to provide the security needed to meet the identified threats. It must also run at a speed which makes its use a practical proposition for an "average" home computer.

It should be easy to install the system and require the minimum of tailoring and parameters. The ultimate objective is that it can be installed and used by those who are not expert in either Linux or cryptography.

The cryptography must take into consideration that the lifetime of the data on disc is likely to be very long - maybe many years. Special consideration must be given to protecting backups. The fact that the plaintext of much of the data will already be known needs to be taken into account.

Master/Working pass phrase concept

One problem with all disc encryption systems is that backups may be copied or stolen. A backup should be useless to an attacker. Even multiple backups should not reveal any significant information. The following concepts are available:

1. Encrypt the whole backup with its own key.
2. Destroy the control block before a backup.
3. Carefully protect the pass phrase of the control block.

Option 1 is useful. The place to encrypt is the backup software itself. This is outside the scope of PPDD as it stands today. The user may well consider option 1 as well as what is offered by PPDD.

Option 2 is very effective but needs careful control. The control block must be backed up just once and several copies made, e.g. to floppy discs. The security of the system depends on the physical security of these copies and the secrecy of the pass phrase used at the time the control block was backed up. Before the encrypted file system is backed up the control block can be destroyed by overwriting it with random data. When a backup needs to be restored the control block must be recovered from the floppy disc.
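
The save, destroy and restore cycle of option 2 can be sketched roughly as follows. This is only an illustration and not part of the PPDD tools: the function names and paths are hypothetical, and it assumes the control block is simply the first 1024 bytes of the encrypted file, as described in the cryptography section of this specification. It should only ever be tried on a scratch file.

```python
import os

# Assumption: the control block is the first 1024 bytes of the file.
CONTROL_BLOCK_SIZE = 1024


def save_control_block(device_path, backup_path):
    """Copy the control block to a separate file (e.g. on a floppy)."""
    with open(device_path, "rb") as dev:
        block = dev.read(CONTROL_BLOCK_SIZE)
    if len(block) != CONTROL_BLOCK_SIZE:
        raise ValueError("file is shorter than one control block")
    with open(backup_path, "wb") as out:
        out.write(block)


def destroy_control_block(device_path):
    """Overwrite the control block in place with random bytes."""
    with open(device_path, "r+b") as dev:
        dev.write(os.urandom(CONTROL_BLOCK_SIZE))
        dev.flush()
        os.fsync(dev.fileno())


def restore_control_block(device_path, backup_path):
    """Write a previously saved control block back to the start of the file."""
    with open(backup_path, "rb") as src:
        block = src.read()
    if len(block) != CONTROL_BLOCK_SIZE:
        raise ValueError("saved control block has the wrong size")
    with open(device_path, "r+b") as dev:
        dev.write(block)
        dev.flush()
        os.fsync(dev.fileno())
```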
A feature of this technique, which may be viewed as good or bad, is that by destroying the floppies and the control block on the live file system the whole data can be completely destroyed. It's faster than a shredder.

In a simple system, option 3 is very risky. The pass phrase is used too often for it to be well protected. Furthermore, no protection is gained by frequently changing the pass phrase. Quite the reverse. An attacker with the backups needs only one of the pass phrases to be able to decrypt all the backup copies he has.

PPDD offers option 2 and also a derivative of option 3 which overcomes the disadvantages.

There are two pass phrases associated with the file system. These are called the "master pass phrase" and the "working pass phrase". Do not confuse these with the fact that PPDD offers the user two lines of pass phrase entry - that is to make longer pass phrases easier to use.

The master pass phrase is the one the user enters at the time he creates the PPDD filesystem. It can be changed at any time later. This pass phrase must be of very good quality. Ideally it is never changed. The objective of the master/working pass phrase technique is to minimise the use of the master pass phrase.

The working pass phrase is entered using one of the PPDD maintenance programs. It can then be used in all circumstances as if it were the master pass phrase.

When a backup is to be made the working pass phrase is first destroyed. This means that the backup can only be decrypted using the master pass phrase. The working pass phrase must be reinstated after the backup. It is intended to add backup routines to PPDD to make this procedure more foolproof.

Implementation - device

The "Linux" side of PPDD is based on the loop device. The author wishes to thank all those who have contributed to the loop device on Linux. Although the loop device is the basis for the driver, PPDD uses none of the cryptography from the loop driver.

Implementation - cryptography

The "Cryptography" side of PPDD is based on Blowfish as the encryption algorithm. The author wishes to thank Bruce Schneier for this algorithm, which is in the public domain and not protected in any way by patents.

The first 1024 bytes of the file are reserved for keys and other control information and are never read or written by the device. The control programs make use of this information.

These 1024 bytes contain an initial seed for the pass phrase. This makes a pre-computed pass phrase dictionary useless. The same pass phrase produces the same key only when the initial seed is the same. The initial seed is 8 bytes and is chosen at random.

The key for the 1024 byte block is derived from the pass phrase and is used to encrypt most of this block in ecb mode. In the encrypted part of the block are the keys for the database and iv data needed for the encryption process.

PPDD makes use of a "master/working" pass phrase system. The key derived from the master pass phrase is held in the block and is encrypted with the key derived from the working pass phrase.

The block contains a checksum to allow the pass phrase to be verified with reasonable certainty.

The key is derived from a pass phrase as follows:

- The user is offered two lines of 104 characters to enter a pass phrase. If a pass phrase is longer than 104 characters it is truncated.
- An IV of 8 bytes is added to the front of the first line.
- If the resulting length is greater than 56 characters, the pass phrase is folded by XORing those characters after the 56th with characters in the first part of the pass phrase.
- The resulting string, up to a maximum of 56 characters, is used to derive a Blowfish key.
- This Blowfish key is used to encrypt the original IV, which is then used as the IV for the second line of pass phrase.
- If the second line of the pass phrase plus IV is less than 16 bytes then it is padded to 16 bytes with the original IV.
- The original IV is appended to the pass phrase to ensure that the next step can always operate on a multiple of 8 bytes.
- The second pass phrase line is then encrypted in cbc mode using the key derived from the first pass phrase line.
- The last 16 bytes of the encrypted pass phrase form 16 bytes of the final key.
- The above process is repeated, interchanging the first and second lines of the pass phrase.
- The final key is made up of the two blocks of 16 bytes, giving a total key length of 32 bytes, i.e. 256 bits.

The process of "opening" the control block is as follows:

- The user enters a pass phrase.
- The first 8 bytes are read from the control block.
- A 256 bit key is derived based on this phrase and the above 8 bytes.
- This key is used to decrypt the block except for the first 8 bytes.
- The checksum of the block is compared with the checksum in the block.
- If the checksum is good the block is "open".
- If not, the key is used to decrypt the part of the block which contains the master key encrypted with the "working" phrase.
- The master key is then used to decrypt the block.
- If the checksum is good then the block is "open", otherwise the pass phrase is wrong.

Except for the first 1024 bytes the data on disc is encrypted as follows:

The disc is treated as if it were made up of 512 byte blocks. The data in the block is spread evenly throughout the block by a process known as whitening. This process requires an IV and this is derived from the block number and from a table of IVs held in the control block. The whitening process also provides an IV which is used in the cbc encryption of the block using a key from the control block.

The control block contains 17 keys and 61 random 4 byte IVs. The key used on a block is derived by dividing the block number by 17 and using the remainder as a subscript to the key array.
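
The key selection rule just described amounts to a simple remainder calculation. The sketch below illustrates only that rule: the function names are hypothetical, the keys are stand-in labels rather than real key material, and how PPDD counts block numbers relative to the 1024 byte control block is not stated here and is therefore not modelled.

```python
# Number of keys held in the control block, as stated in the text.
NUM_KEYS = 17


def key_index(block_number):
    """Return the subscript into the key array for a 512 byte block."""
    if block_number < 0:
        raise ValueError("block number must not be negative")
    # "Dividing the block number by 17 and using the remainder".
    return block_number % NUM_KEYS


def select_key(keys, block_number):
    """Pick the key for a block from the 17 keys of the control block."""
    if len(keys) != NUM_KEYS:
        raise ValueError("expected exactly 17 keys")
    return keys[key_index(block_number)]


# Stand-in labels instead of real keys, for illustration only.
demo_keys = ["key%d" % i for i in range(NUM_KEYS)]
```

With these stand-in labels, blocks 3, 20 and 37 all select the same entry of the key array, since each leaves remainder 3 when divided by 17.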
The whitening IVs are the block number itself and the IVs from the control block selected by dividing the block number by 59 and by 61 and using the remainders as subscripts.

The result is that no two blocks are encrypted the same way. Also, any change in the data in a block results in a completely different 512 byte block after encryption. The same data block at different places on the disc encrypts completely differently.

Decryption is similar except that the cbc decryption step must take place in two stages because the IV for the first part of the block is known only after the second part of the block has been decrypted. The whitening routine is of course reversible.

Implementation - random numbers

PPDD uses two sources of random numbers. For those which are less important the Linux "urandom" device is used. For high grade random numbers PPDD uses the Linux "random" device.

The random number generators have been modified by using whitening and by repeated use of "urandom" in case any problem is discovered later with the Linux method of generating random numbers.

When high grade random numbers are needed, random input from the user - in the form of random keystrokes - is added to whatever Linux provides.

Allan Latham
April 1999