PPDD Practical Privacy Disc Driver

Introduction

This is the second version of the specification. It relates to version 0.7
and later of PPDD.


What is ppdd?

PPDD is a disc driver and support programs for Intel based systems running
Linux. It allows secure encryption of all data on disc including the root
filesystem and the swap file.


Threat Scenario

PPDD is designed to protect against unwanted access to data on the discs of
a computer running Linux on Intel x86 based machines. The threats covered
are:

	Someone steals the powered-off computer and discs.
	Someone steals or copies one or more backups.
	Someone boots the computer using a floppy boot disc and copies
		the data from the hard discs.

Although PPDD will prevent an unauthorised person having access to the data
it will not in any way aid the recovery of that data if the attacker
destroys it or if he steals the computer.

No protection is offered against any other threats, particularly:

	Access to data by an unauthorised user on a multi-user system.
	Any kind of access from another computer via a network.
	Access by a "trojan horse" type program or similar.
	Physical access to the computer after access to the encrypted
		discs has been enabled.

For those threats covered protection rests on:

	The use of an adequate pass phrase.
	The secrecy afforded to this pass phrase.
	The following of a set procedure related to making backups.
	The security of the "Blowfish" algorithm.
	The correct implementation of the algorithm.
	The absence of compromising bugs in the programs.

PPDD also offers some degree of protection against tampering with the data
it protects. This is provided by optional checksums. The protection is
incomplete because it relies on having at least a kernel and checksum
checking programs which themselves have not been tampered with. That in turn
relies on physical protection measures outside the scope of PPDD. This applies
to all integrity checking and tamper detecting schemes (e.g. Tripwire).

However the checksum facilities make life harder for the attacker at little
extra cost.


Design considerations

The design is required to provide the security needed to meet the identified
threats. It must also run at a speed which makes its use a practical
proposition for an "average" home computer. It should be easy to install the
system and require the minimum of tailoring and parameters.

The ultimate objective is that it can be installed and used by those who are
not expert in either Linux or cryptography.

The cryptography must take into consideration that the lifetime of the data
on disc is likely to be very long - maybe many years. Special consideration
must be given to protecting backups. The fact that the plaintext of much of
the data will already be known needs to be taken into account.


Master/Working pass phrase concept

One problem with all disc encryption systems is that backups may be copied or
stolen. A backup should be useless to an attacker. Even multiple backups
should not reveal any significant information. The following concepts are
available:

	1. Encrypt the whole backup with its own key.
	2. Destroy the control block before a backup.
	3. Carefully protect the pass phrase of the control block.

Option 1 is useful. The place to encrypt is the backup software itself. This
is outside the scope of PPDD as it stands today. The user may well consider
option 1 as well as what is offered by PPDD.

Option 2 is very effective but needs careful control. The control block must
be backed up just once and several copies made e.g. to floppy discs. The
security of the system depends on the physical security of these copies and
the secrecy of the pass phrase used at the time the control block was backed
up. Before the encrypted file system is backed up the control block can be
destroyed by overwriting it with random data. When a backup needs to be 
restored the control block must be recovered from the floppy disc. A feature
of this technique, which may be viewed as good or bad, is that by destroying
the floppies and the control block on the live file system the whole data
can be completely destroyed. It's faster than a shredder.

In a simple system, option 3 is very risky. The pass phrase is used too often
for it to be well protected. Furthermore there is no protection obtained by
frequently changing the pass phrase. Quite the reverse. The attacker with the
backups needs only one of the pass phrases to be able to decrypt all the backup
copies he has.

PPDD offers option 2 and also a derivative of option 3 which overcomes the
disadvantages. There are two pass phrases associated with the file system.
These are called the "master pass phrase" and the "working pass phrase".
Do not confuse these with the fact that PPDD offers the user two lines of
pass phrase entry - that is to make longer pass phrases easier to use.

The master pass phrase is the one the user enters at the time he creates the
PPDD filesystem. It can be changed at any time later. This pass phrase must
be very good quality. Ideally it is never changed. The objective of the
master/working pass phrase technique is to minimise the use of the master
pass phrase.

The working pass phrase is entered using one of the PPDD maintenance programs.
It can then be used in all circumstances as if it were the master pass phrase.
When a backup is to be made the working pass phrase is first destroyed. This
means that the backup can only be decrypted using the master pass phrase. The
working pass phrase must be reinstated after the backup.

It is intended to add backup routines to PPDD to make this procedure more
foolproof.


Implementation - device

The "Linux" side of PPDD is based on the loop device. The author wishes to
thank all those who have contributed to the loop device on Linux. Although the
loop device is the basis for the driver, PPDD uses none of the cryptography
from the loop driver.


Implementation - cryptography

The "Cryptography" side of PPDD is based on Blowfish as the encryption
algorithm. The author wishes to thank Bruce Schneier for this algorithm,
which is in the public domain and not protected in any way by patents.

The first 1024 bytes of the file are reserved for keys and other control
information and are never read or written by the device. The control programs
make use of this information.

These 1024 bytes contain an initial seed for the pass phrase. This makes a
precomputed pass phrase dictionary useless. The same pass phrase produces the
same key only when the initial seed is the same. The initial seed is 8 bytes
and is chosen at random.

The key for the 1024 byte block is derived from the pass phrase and is used to
encrypt most of this block in ecb mode.

In the encrypted part of the block are the keys for the database and iv data
needed for the encryption process. PPDD makes use of a "master/working" pass
phrase system. The key derived from the master pass phrase is held in the
block and is encrypted with the key derived from the working pass phrase.

The block contains a checksum to allow the pass phrase to be verified with
reasonable certainty.

The key is derived from a pass phrase as follows:

	The user is offered two lines of 104 characters to enter a pass phrase.
	If a pass phrase is longer than 104 characters it is truncated.
	An IV of 8 bytes is added to the front of the first line.
	If the resulting length is greater than 56 characters then fold the
		pass phrase by XORing those characters after the 56th with
		characters in the first part of the pass phrase.
	The resulting string, up to a maximum of 56 characters, is used to
		derive a Blowfish key.
	This Blowfish key is used to encrypt the original IV which is then
		used as the IV for the second line of pass phrase.
	If the second line of the pass phrase plus IV is less than 16 bytes
		then pad it to 16 bytes with the original IV.
	The original IV is appended to the pass phrase to ensure that the
		next step can always operate on a multiple of 8 bytes.
	The second pass phrase line is then encrypted in cbc mode using the key
		derived from the first pass phrase line.
	The last 16 bytes of the encrypted pass phrase form 16 bytes of the
		final key.
	The above process is repeated interchanging the first and second lines
		of the pass phrases.
	The final key is made up of the two blocks of 16 bytes, giving a total
		key length of 32 bytes i.e. 256 bits.
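
The folding step for one pass phrase line can be sketched as follows. This is
a minimal illustration in Python, not the PPDD source. The function name
fold_line and the constants are hypothetical, and the sketch assumes that the
byte at position 56+i is XORed into position i, which the description above
does not pin down. The Blowfish key setup and the cbc step are left out.

```python
KEY_MAX = 56    # Blowfish accepts keys of at most 56 bytes (448 bits)
LINE_MAX = 104  # longest pass phrase line accepted
IV_LEN = 8      # length of the IV placed in front of the line


def fold_line(iv: bytes, line: bytes) -> bytes:
    """Prepend the IV, truncate the line, and fold the result to 56 bytes.

    Assumption: byte 56+i is XORed into byte i of the first part.
    """
    if len(iv) != IV_LEN:
        raise ValueError("IV must be exactly 8 bytes")
    data = iv + line[:LINE_MAX]          # truncate, then put the IV in front
    folded = bytearray(data[:KEY_MAX])   # the first part of the string
    for i, byte in enumerate(data[KEY_MAX:]):
        folded[i % KEY_MAX] ^= byte      # fold the remainder back in
    return bytes(folded)


if __name__ == "__main__":
    material = fold_line(bytes(8), b"correct horse battery staple")
    print(len(material), material.hex())
```

With an 8 byte IV and a 104 character line the input is exactly 112 bytes, so
a single fold onto 56 bytes covers the longest possible case. Because the IV
is part of the folded string, the same pass phrase with a different IV gives
different key material.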


The process of "opening" the control block is as follows:

	The user enters a pass phrase.
	The first 8 bytes are read from the control block.
	A 256 bit key is derived based on this phrase and the above 8 bytes.
	This key is used to decrypt the block except for the first 8 bytes.
	The checksum of the block is compared with the checksum in the block.
	If the checksum is good the block is "open".
	If not then the key is used to decrypt the part of the block which
		contains the master key encrypted with the "working" phrase.
	The master key is then used to decrypt the block.
	If the checksum is good then the block is "open" otherwise the pass
		phrase is wrong.


Except for the first 1024 bytes the data on disc is encrypted as follows:
	
	The disc is treated as if it were made up of 512 byte blocks.
	The data in the block is spread evenly throughout the block by a
		process known as whitening.
	This process requires an IV and this is derived from the block number
		and from a table of IVs held in the control block.
	The whitening process also provides an IV which is used in the cbc
		encryption of the block using a key from the control block.

	The control block contains 17 keys and 61 random 4 byte IVs.
	The key used on a block is derived by dividing the block number by 17
		and using the remainder as a subscript to the key array.
	The whitening IVs are the block number itself and the IVs from the
		control block selected by dividing the block number by 59 and
		by 61 and using the remainders as subscripts.

	The result is that no block is encrypted the same way. Also any change
	in the data in a block results in a completely different 512 byte block
	after encryption. The same data block at different places on the disc
	encrypts completely differently.
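
The selection of key and IV subscripts described above can be sketched as
follows. This is an illustration in Python under stated assumptions, not the
PPDD source: the names select_indices and PERIOD are hypothetical, and the
sketch only computes subscripts. It does not perform whitening or encryption.

```python
NUM_KEYS = 17        # keys held in the control block
IV_MODULI = (59, 61) # divisors used to pick IVs from the 61 entry table

# 17, 59 and 61 are distinct primes, so the triple of subscripts repeats
# only after 17 * 59 * 61 = 61183 blocks.
PERIOD = NUM_KEYS * IV_MODULI[0] * IV_MODULI[1]


def select_indices(block_number: int) -> tuple:
    """Return (key subscript, first IV subscript, second IV subscript)."""
    if block_number < 0:
        raise ValueError("block number must not be negative")
    return (
        block_number % NUM_KEYS,
        block_number % IV_MODULI[0],
        block_number % IV_MODULI[1],
    )


if __name__ == "__main__":
    for n in (0, 1, 100, PERIOD):
        print(n, select_indices(n))
```

Both IV subscripts stay within the 61 entry table. The subscripts alone repeat
every 61183 blocks, which is why the block number itself is also one of the
whitening IVs: it keeps two blocks with the same subscripts from being
treated identically.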

Decryption is similar except that the cbc decryption step must take place in
two stages because the IV for the first part of the block is known only after
the second part of the block has been decrypted. The whitening routine is of
course reversible.


Implementation - random numbers

PPDD uses two sources of random numbers. For those which are less important
the Linux "urandom" device is used. For high grade random numbers PPDD
uses the Linux "random" device. The random number generators have been
modified by using whitening and by repeated use of "urandom" in case there
is any problem discovered later with the Linux method of generating random
numbers. When high grade random numbers are needed then random input from
the user - in the form of random keystrokes - is added to whatever Linux
provides.
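
The idea of adding user keystrokes to the system randomness can be sketched
as follows. This is an illustration only and is not how PPDD does it: the
function name mix_user_entropy is hypothetical, os.urandom stands in for
reading the Linux random devices, and SHAKE-256 is a modern stand-in for
PPDD's own whitening, which is not specified here.

```python
import hashlib
import os


def mix_user_entropy(n, keystrokes, system_source=os.urandom):
    """Return n bytes of system randomness XORed with a keystroke stream.

    The keystrokes are stretched to n bytes with SHAKE-256 (an assumption,
    not PPDD's method). XOR means the result is no easier to predict than
    the system bytes alone, even if the keystrokes are poor.
    """
    system_bytes = system_source(n)
    user_stream = hashlib.shake_256(keystrokes).digest(n)
    return bytes(a ^ b for a, b in zip(system_bytes, user_stream))


if __name__ == "__main__":
    typed = input("Type some random keys and press return: ").encode()
    print(mix_user_entropy(16, typed).hex())
```

The design point is that user input is only ever combined with the system
output, never used in place of it, so a weakness in either source alone does
not expose the result.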

Allan Latham April 1999