9 minutes
ICOM6045 Cryptography
Overview
Security threats (explain basically)
- content
- eavesdropping: the message would be listened by others when you communicate with your friends, using email or whatsapp
- masquerading: identity stealing
- message tampering: someone modify the message in the middle
- replaying: receive the same message twice, it is dangerous in transactions, but not in instant message, etc
- machine
- infiltration: receive marewares (can control PC or smart phones) from hackers
- traffic
- traffic analysis: only interestes in how to communicate
- denial of service: make others cannot access the services
Security services (implemented by security mechanisms)
- security policy
- from risk analysis, we can have some security policies to deal with
- may not cover all possible risks, should have a reasonable trade-off between risks and available resources
- ISO security architecture
- authentication
- access control
- data confidentiality
- data integrity
- nonrepudiation
- basic computer security concepts (CIA)
- related to cryptography
- confidentiality: prevent unauthorized access
- integrity: prevent unauthorized modification
- related to access control
- availability: prevent unauthorized withholding of information or resources
- authenticity: verify the origin of data
- accountability: audit information and trace to the responsible party if there are security issues
- related to cryptography
Encryption and Decryption (focus on classical ones)
Types (still use today)
- symmetric key encryption (single key cryptosystem)
- using the same encryption and decryption key
- e.g. AES, 3DES, RC4, pdf password
- usage:
- file encryption
- others with lots of data
- rely on substitution and permutation, do not need to much computing
- using the same encryption and decryption key
public key encryption (two key cryptosystem)
- using different encryption and decryption keys
- public key (encryption key): everyone knows it
- private key (decryption key): only I know
- e.g. RSA (still popular today)
- usage:
- communication
- digital signature
- others not cover lots of data
- rely on complex math, need much computing
if you use the e-banking system or https, you should use both of them
- using different encryption and decryption keys
Monoalphabetic substitutions
- concept
- substitution: each letter is substituted by another one
- permutation: reorder the elements of a series
- caesar cipher
- each letter is translated to the letter a fixed number of letters after it in the alphabet
- Julius Caesar used a shift of 3
- attack (short message)
- find the similarity of ciphertext (e.g. "wr" means "to" -> "wrr" means "too")
- guess the pattern(the number of shifting)
- each letter is translated to the letter a fixed number of letters after it in the alphabet
- cryptanalysis (mainly for long message)
- count the frequency of the letter, find the difference according to the frequency of normal articles
- weakness: the frequency of distribution reflects the distribution of the underlying alphabet
- make it stronger -> polyalphabetic substitution ciphers
- using combine or multiple distributions -> different tables
- if you use 2 tables, it is not safe (if someone can guess the tables)
- 26 tables may be safer, but it is difficult to maintain
- solution (vigenere cipher): use a keyword (not be encrypted, but use different method or channel), letters in the key will be used to select a particular permutation (e.g. juliet, which needs to be repeated)
- make it stronger -> polyalphabetic substitution ciphers
index of coincidence
a measure of the variation between frequencies in a distribution (n is the number of ciphertext)
\[IC=\sum_{i=a}^{i=z}\frac{Freq(i)\times(Freq(i)-1)}{n\times(n-1)}\]
IC ranges from 0.0384 (polyalphabetic substitution with perfectly flat distribution) to 0.068(monoalphabetic substitution from common English)
Alphabets IC 1 0.068 2 0.052 3 0.047 4 0.044 5 0.044 10 0.041 large 0.038 the value of IC
- close to or above 0.068 -> monoalphabetic substitution
- close to or below 0.038 -> polyalphabetic substitution
kasiski method: determine when a pattern of encrypting permutations has repeated to predict the number of alphabets used for substitutions
- search for repeated sequence of characters
- calculate the distance between each two of them
- the estimated key length is the common divisor
analyze polyalphabetic cipher
- use the kasiski method to predict likely numbers of enciphering alphabets
- if no numbers emerge fairly regularly, the encryption is not probably not a polyalphabetic substitution
- divide the ciphertext into serval sequences, according to the key length
- calculate IC for each sequence
The "perfect" substitution cipher
- use infinite or a large nonrepeating keys
- problems
- absolute synchronization between sender and receiver, difficult to distribute
- need an unlimited number of keys
- problems
vernam cipher
- use random number as the key
e.g.
plaintext V E R N A M C I P H E R encoded 21 4 17 13 0 12 2 8 15 7 4 17 random 76 48 16 82 33 3 58 11 60 5 48 88 sum 97 52 33 95 33 15 60 19 75 12 52 105 mod 26 19 0 7 17 18 15 8 19 23 12 0 1 ciphertext T A H R S P I T X M A B possible attack
- random number generator
- most common type of random number generator: linear congruential random number generator
- seed: the initial value \(r_0\)
successive random number \(r_{i+1}\)
\[r_{i+1}=(a\times r_i+b)\mod\ n\]
- if someone knows a, b and n, he or she can use the current random number to calculate the next random number
cracking the random number generator becomes solving a systems of equations
Permutations (transpositions)
- rearrange the symbols of a message and try to break established patterns
- columnar transposition: rearrangement of the characters of the plaintext into columns
- e.g. THIS IS A MESSAGE TO SHOW HOW A COLUMN TRANSPOSITION WORKS
- cryptanalysis
- compute the letter frequencies: all letters appear with their normal frequencies implies that a transposition has been performed
- break the text into columns by compare a block of ciphertext characters against characters successively farther away in the ciphertext
- columnar transposition: rearrangement of the characters of the plaintext into columns
- complexity
- the execution time of the algorithm is proportional to the length of the message
- store all characters
- output cannot be generated until all characters have been read (not good for long message)
- alternative method: permute the characters of the plaintext with a fixed period d
- e.g. d = 5, and the permutation is (2 4 5 1 3)
Confusion and Diffusion
- confusion: change symbols
- bad confusion: caesar cipher
- good confusion: polyalphabetic substitution with a long key
- diffusion: change order
- bad diffusion: the substitution ciphers
- good diffusion: the transposition ciphers
- DES provides good confusion and diffusion
- substitution provides confusion
- permutation provides diffusion
- purpose of cryptography -> data confidentiality
Ciphers
- stream cipher: convert one symbol of plaintext immediately into a symbol of ciphertext, e.g. the substitution cipher
- advantages:
- speed of transformation
- low error propagation: each symbol is separately encoded
- disadvantages:
- low diffusion: all information of a symbol is contained in the symbol, subject to attack like frequency count, digram analysis, IC and Kasiski method
- possible for malicious insertions and modifications
- advantages:
- block cipher: encrypt a group of plaintext symbol as one block, e.g. the columnar transposition cipher
- advantages:
- diffusion: information of plaintext is diffused into several ciphertext symbols
- immunity to insertions: impossible to insert a single symbol in a block
- disadvantages:
- slowness of encryption
- error propagation: an error will affect all other characters in the same block
- advantages:
Rotor Machine
- contains three rotors (key)
- is implemented by polyalphabetic substitution (not fixed substitution)
- change one position, change the key
- the number of keys is \({26}^3\)
- procedure
- use original setting to do substitution
- after one input, the next input may generate the rotation of the first rotor (one position)
- if the first rotor rotates a cirle, jump to the second rotor
- after one input, the next input may generate the rotation of the second rotor (one position)
- until three rotors finish their rotation -> original status -> again and again
- utility: enigma machine (by German, with a 256-element rotor, use all rotors twice for each letter)
- example
Modern Cryptography
DES (data encryption standard)
- a block cipher with 56-bit key (64-bit including parity bits), 64-bit block
- the reason for 64-bit: 64-bit is 8 bytes, each character uses 1 byte in the computer, 8 characters means we can use ascii code to represent
- parity bits: just like checksum, solving the nosie problem, but today we may not use this
- most commonly use block cipher
- in 1990s, this method was not safe any more, then generate "triple DES" (1999) and AES (2001, use until now)
- 3-DES
- still use today, but someone thinks out of date
- use this method due to orginial DES hardware can be used (save money)
- based on "Feistel" network structure
- means for both encryption and decryptionn use the same hardware (box)
- DES use this structure for 16 times, but using different subkeys
- method, ciphertext, etc. are public, but you cannot decrpyt it (due to lots of substitution and permutation)
- mathmatics
- \(L_{i+1}=R_i\)
- \(R_{i+1}=L_i\bigoplus f(K_i,R_i)\)
- inverse is the same
- \(R_i=L_{i+1}\)
- \(L_i=R_{i+1}\bigoplus f(K_i,L_{i+1})\)
- f function
- S-Box
- public
- fixed (6 bits -> 4 bits)
- design criteria (early undisclosed)
- the S-boxes were carefully tuned to increase resistance against differential cryptanalysis
- no S-box is a linear or affine function of its input: the 4 output bits cannot be expressed as a system of linear equations of the 6 input bits
- changing 1 bit in the input of an S-box results in changing at least 2 output bits: the S-boxes diffuse their information well throughout their outputs
- the S-boxes were chosen to minimize the difference between the number of 1s and 0s when any single input bit is held constant: holding a single bit constant as a 0 or 1 and changing the bits around it should not lead to disproportionately many 0s or 1s in the output
- P-Box
- public
- fixed
- the only secret is the key, without it, you cannot decrypt the ciphertext
- but it is not safe now, due to computing power (only \(2^{56}\) possible keys)
- S-Box
- designed to facilitate hardware implementation
AES (advanced encryption standard)
- mandates a block size of 128 bits and a choice of key size from 128, 192, 256 bits (called AES-128, AES-192, AES-256)
- using 128-bit key (16 bytes, 4 4-octet)
- diffusion and confusion
- diffusion: by permutations in ShiftRows and MixColumns
- confusion: by substitutions (S-Box) in SubBytes
Hash
- cryptographic hash function
- compression: for any size of input x, the output length of y = h(x) is small
- features
- efficiency: easy to compute h(x) for any input x
- one-way: given any value y, it is computationally infeasible to find a value x such that h(x) = y
- collision resistance
- given x and h(x), if is infeasible to find y ≠ x such that h(x) = h(y)
- it is infeasible to find x and y, with y ≠ x, such that h(x) = h(y)
- common hash functions: MD5, SHA-1, SHA-256
- basic structure of SHA-1
- split message into 512-bit blocks
- the former result would be the next input
- if the memory is spare, padding
- basic structure of SHA-1
- HMAC (hash message authentication code)
- today, we use digital signature to verfity identity
- in old days, we uese HMAC to verify indentity
- the use of a shared secret key that provides authentication in addition to integrity protection
- features
- faster than encryption in software
- use of any hash function is permitted in any export product from US
- used in IPsec and TLS