Overview

Security threats (explain basically)

  • content
    • eavesdropping: the message would be listened by others when you communicate with your friends, using email or whatsapp
    • masquerading: identity stealing
    • message tampering: someone modify the message in the middle
    • replaying: receive the same message twice, it is dangerous in transactions, but not in instant message, etc
  • machine
    • infiltration: receive marewares (can control PC or smart phones) from hackers
  • traffic
    • traffic analysis: only interestes in how to communicate
    • denial of service: make others cannot access the services

Security services (implemented by security mechanisms)

  • security policy
    • from risk analysis, we can have some security policies to deal with
    • may not cover all possible risks, should have a reasonable trade-off between risks and available resources
  • ISO security architecture
    • authentication
    • access control
    • data confidentiality
    • data integrity
    • nonrepudiation
  • basic computer security concepts (CIA)
    • related to cryptography
      • confidentiality: prevent unauthorized access
      • integrity: prevent unauthorized modification
    • related to access control
      • availability: prevent unauthorized withholding of information or resources
      • authenticity: verify the origin of data
      • accountability: audit information and trace to the responsible party if there are security issues

Encryption and Decryption (focus on classical ones)

Types (still use today)

  • symmetric key encryption (single key cryptosystem)
    • using the same encryption and decryption key
      • e.g. AES, 3DES, RC4, pdf password
    • usage:
      • file encryption
      • others with lots of data
    • rely on substitution and permutation, do not need to much computing
  • public key encryption (two key cryptosystem)

    • using different encryption and decryption keys
      • public key (encryption key): everyone knows it
      • private key (decryption key): only I know
      • e.g. RSA (still popular today)
    • usage:
      • communication
      • digital signature
      • others not cover lots of data
    • rely on complex math, need much computing

    if you use the e-banking system or https, you should use both of them

Monoalphabetic substitutions

  • concept
    • substitution: each letter is substituted by another one
    • permutation: reorder the elements of a series
  • caesar cipher
    • each letter is translated to the letter a fixed number of letters after it in the alphabet
      • Julius Caesar used a shift of 3
    • attack (short message)
      • find the similarity of ciphertext (e.g. "wr" means "to" -> "wrr" means "too")
      • guess the pattern(the number of shifting)
  • cryptanalysis (mainly for long message)
    • count the frequency of the letter, find the difference according to the frequency of normal articles
    • weakness: the frequency of distribution reflects the distribution of the underlying alphabet
      • make it stronger -> polyalphabetic substitution ciphers
        • using combine or multiple distributions -> different tables
        • if you use 2 tables, it is not safe (if someone can guess the tables)
        • 26 tables may be safer, but it is difficult to maintain
          • solution (vigenere cipher): use a keyword (not be encrypted, but use different method or channel), letters in the key will be used to select a particular permutation (e.g. juliet, which needs to be repeated)
  • index of coincidence

    • a measure of the variation between frequencies in a distribution (n is the number of ciphertext)

      \[IC=\sum_{i=a}^{i=z}\frac{Freq(i)\times(Freq(i)-1)}{n\times(n-1)}\]

    • IC ranges from 0.0384 (polyalphabetic substitution with perfectly flat distribution) to 0.068(monoalphabetic substitution from common English)

      Alphabets IC
      1 0.068
      2 0.052
      3 0.047
      4 0.044
      5 0.044
      10 0.041
      large 0.038
    • the value of IC

      • close to or above 0.068 -> monoalphabetic substitution
      • close to or below 0.038 -> polyalphabetic substitution
    • kasiski method: determine when a pattern of encrypting permutations has repeated to predict the number of alphabets used for substitutions

      • search for repeated sequence of characters
      • calculate the distance between each two of them
      • the estimated key length is the common divisor
  • analyze polyalphabetic cipher

    • use the kasiski method to predict likely numbers of enciphering alphabets
    • if no numbers emerge fairly regularly, the encryption is not probably not a polyalphabetic substitution
    • divide the ciphertext into serval sequences, according to the key length
    • calculate IC for each sequence

The "perfect" substitution cipher

  • use infinite or a large nonrepeating keys
    • problems
      • absolute synchronization between sender and receiver, difficult to distribute
      • need an unlimited number of keys
  • vernam cipher

    • use random number as the key
    • e.g.

      plaintext V E R N A M C I P H E R
      encoded 21 4 17 13 0 12 2 8 15 7 4 17
      random 76 48 16 82 33 3 58 11 60 5 48 88
      sum 97 52 33 95 33 15 60 19 75 12 52 105
      mod 26 19 0 7 17 18 15 8 19 23 12 0 1
      ciphertext T A H R S P I T X M A B
    • possible attack

      • random number generator
      • most common type of random number generator: linear congruential random number generator
      • seed: the initial value \(r_0\)
      • successive random number \(r_{i+1}\)

        \[r_{i+1}=(a\times r_i+b)\mod\ n\]

        • if someone knows a, b and n, he or she can use the current random number to calculate the next random number
      • cracking the random number generator becomes solving a systems of equations

Permutations (transpositions)

  • rearrange the symbols of a message and try to break established patterns
    • columnar transposition: rearrangement of the characters of the plaintext into columns
      • e.g. THIS IS A MESSAGE TO SHOW HOW A COLUMN TRANSPOSITION WORKS
      • cryptanalysis
        • compute the letter frequencies: all letters appear with their normal frequencies implies that a transposition has been performed
        • break the text into columns by compare a block of ciphertext characters against characters successively farther away in the ciphertext
  • complexity
    • the execution time of the algorithm is proportional to the length of the message
    • store all characters
    • output cannot be generated until all characters have been read (not good for long message)
  • alternative method: permute the characters of the plaintext with a fixed period d
    • e.g. d = 5, and the permutation is (2 4 5 1 3)

Confusion and Diffusion

  • confusion: change symbols
    • bad confusion: caesar cipher
    • good confusion: polyalphabetic substitution with a long key
  • diffusion: change order
    • bad diffusion: the substitution ciphers
    • good diffusion: the transposition ciphers
  • DES provides good confusion and diffusion
    • substitution provides confusion
    • permutation provides diffusion
  • purpose of cryptography -> data confidentiality

Ciphers

  • stream cipher: convert one symbol of plaintext immediately into a symbol of ciphertext, e.g. the substitution cipher
    • advantages:
      • speed of transformation
      • low error propagation: each symbol is separately encoded
    • disadvantages:
      • low diffusion: all information of a symbol is contained in the symbol, subject to attack like frequency count, digram analysis, IC and Kasiski method
      • possible for malicious insertions and modifications
  • block cipher: encrypt a group of plaintext symbol as one block, e.g. the columnar transposition cipher
    • advantages:
      • diffusion: information of plaintext is diffused into several ciphertext symbols
      • immunity to insertions: impossible to insert a single symbol in a block
    • disadvantages:
      • slowness of encryption
      • error propagation: an error will affect all other characters in the same block

Rotor Machine

  • contains three rotors (key)
  • is implemented by polyalphabetic substitution (not fixed substitution)
    • change one position, change the key
    • the number of keys is \({26}^3\)
  • procedure
    • use original setting to do substitution
    • after one input, the next input may generate the rotation of the first rotor (one position)
    • if the first rotor rotates a cirle, jump to the second rotor
    • after one input, the next input may generate the rotation of the second rotor (one position)
    • until three rotors finish their rotation -> original status -> again and again
  • utility: enigma machine (by German, with a 256-element rotor, use all rotors twice for each letter)
  • example

Modern Cryptography

DES (data encryption standard)

  • a block cipher with 56-bit key (64-bit including parity bits), 64-bit block
    • the reason for 64-bit: 64-bit is 8 bytes, each character uses 1 byte in the computer, 8 characters means we can use ascii code to represent
    • parity bits: just like checksum, solving the nosie problem, but today we may not use this
  • most commonly use block cipher
    • in 1990s, this method was not safe any more, then generate "triple DES" (1999) and AES (2001, use until now)
    • 3-DES
      • still use today, but someone thinks out of date
      • use this method due to orginial DES hardware can be used (save money)
  • based on "Feistel" network structure
    • means for both encryption and decryptionn use the same hardware (box)
    • DES use this structure for 16 times, but using different subkeys
      • method, ciphertext, etc. are public, but you cannot decrpyt it (due to lots of substitution and permutation)
    • mathmatics
      • \(L_{i+1}=R_i\)
      • \(R_{i+1}=L_i\bigoplus f(K_i,R_i)\)
      • inverse is the same
        • \(R_i=L_{i+1}\)
        • \(L_i=R_{i+1}\bigoplus f(K_i,L_{i+1})\)
      • f function
        • S-Box
          • public
          • fixed (6 bits -> 4 bits)
          • design criteria (early undisclosed)
            • the S-boxes were carefully tuned to increase resistance against differential cryptanalysis
            • no S-box is a linear or affine function of its input: the 4 output bits cannot be expressed as a system of linear equations of the 6 input bits
            • changing 1 bit in the input of an S-box results in changing at least 2 output bits: the S-boxes diffuse their information well throughout their outputs
            • the S-boxes were chosen to minimize the difference between the number of 1s and 0s when any single input bit is held constant: holding a single bit constant as a 0 or 1 and changing the bits around it should not lead to disproportionately many 0s or 1s in the output
        • P-Box
          • public
          • fixed
        • the only secret is the key, without it, you cannot decrypt the ciphertext
          • but it is not safe now, due to computing power (only \(2^{56}\) possible keys)
  • designed to facilitate hardware implementation

AES (advanced encryption standard)

  • mandates a block size of 128 bits and a choice of key size from 128, 192, 256 bits (called AES-128, AES-192, AES-256)
    • using 128-bit key (16 bytes, 4 4-octet)
  • diffusion and confusion
    • diffusion: by permutations in ShiftRows and MixColumns
    • confusion: by substitutions (S-Box) in SubBytes

Hash

  • cryptographic hash function
    • compression: for any size of input x, the output length of y = h(x) is small
    • features
      • efficiency: easy to compute h(x) for any input x
      • one-way: given any value y, it is computationally infeasible to find a value x such that h(x) = y
      • collision resistance
        • given x and h(x), if is infeasible to find y ≠ x such that h(x) = h(y)
        • it is infeasible to find x and y, with y ≠ x, such that h(x) = h(y)
    • common hash functions: MD5, SHA-1, SHA-256
      • basic structure of SHA-1
        • split message into 512-bit blocks
        • the former result would be the next input
        • if the memory is spare, padding
  • HMAC (hash message authentication code)
    • today, we use digital signature to verfity identity
    • in old days, we uese HMAC to verify indentity
    • the use of a shared secret key that provides authentication in addition to integrity protection
    • features
      • faster than encryption in software
      • use of any hash function is permitted in any export product from US
      • used in IPsec and TLS