Hill Cipher

The Hill cipher is considered to be the first polygraphic cipher in which it is practical to work on more than three symbols at once.

Usage

The Hill cipher was created in 1929 by Lester S. Hill, an American mathematician.

Algorithm

In the Hill cipher each letter corresponds to one unique number, from 0 to 25. The simplest scheme (A = 0, B = 1, ..., Z = 25) is used the most often but one can choose other combination as well.

Messages are divided into n-letter blocks. Encryption is performed by multiplication of all blocks by one n x n secret matrix, which contains also numbers from 0 to 25. All the results should be modulo 26. The matrix can be defined based on a secret keyword, which contains n2 letters (one should just ignore other unnecessary letters).

The decryption algorithm is similar to the encryption process. One should divide the ciphertext into blocks (each with n letters) and multiply them by the inverse of the matrix modulo 26 used for encryption.

Encryption and Decryption

To encrypt a message vino using a 2 x 2 matrix, one should divide the message into two blocks of two letters. Then one should change the letters into numbers:

  
V
I
  ,  
N
O
  -->  
21
10
  ,  
14
15

The matrix below wwill be used as a secret key:

  K =  
33
25

It is easy to calculate the inverse of the matrix modulo 26 using for decryption:

  K-1 =  
1517
209

Encryption of two plaintext blocks is about multiplication them with the key matrix:

33
25
  x  
21
10
  =  
93 mod 26
92 mod 26
  =  
15
14

33
25
  x  
14
15
  =  
87 mod 26
103 mod 26
  =  
9
25

The result is a 4-digit sequence 15 14 9 25 so oniy. Decryption is similar to encryption. Ciphertext letters are changed into digits, divide into two 2-digit blocks and multiplying with the inverse of the matrix K-1:

1517
209
  x  
15
14
  =  
463 mod 26
426 mod 26
  =  
21
10

1517
209
  x  
9
21
  =  
560 mod 26
405 mod 26
  =  
14
15

The received four number can be changed into the original plaintext letters vino.

Security of the Hill cipher

The basic version of Hill cipher is vulnerable to known-plaintext attacks because the whole encryption process is linear. The intruder who intercepts n2 pairs of plaintext and ciphertext corresponding letters can create a linear system which can be quite easily solved. Adding a few more pairs allows to choose a correct solution if more than one was discovered. Solving linear equations usually doesn't take much time.

Number of possible keys

There are 26n2 possible matrices of dimension n x n, which can contain only numbers from 0 to 26. However to consider a matrix as a potential secret key, the matrix must be invertible.

The number of invertible matrices can be computed using the Chinese Remainder Theorem. A matrix is invertible modulo 26 if and only if it is invertible both modulo 13 and modulo 2.

The number of invertible n x n matrices modulo 2 is determined by the order of the general linear group GL(n,Z2):
    2n2(1 - 1/2) (1 - 1/22) ... (1 - 1/2n)

Similarly, the number of invertible n x n matrices modulo 13 can be calculated in the same way:
    13n2(1 - 1/13) (1 - 1/132) ... (1 - 1/13n)

Finally, the number of such invertible matrices modulo 26 is equal to the product of the two above results:
    26n2(1 - 1/2) (1 - 1/22) ... (1 - 1/2n)(1 - 1/13) (1 - 1/132) ... (1 - 1/13n)

This may be approximated as 4,64 n2 - 1,7, which is about 116 bits for 5 x 5 matrix.