On the Infeasibility of Modeling Polymorphic Shellcode

Background Knowledge and Insights

Shellcode

  • Traditional shellcode structure
    • [nop][payload][retaddr]
  • Encrypted shellcode structure
    • [nop][decoder][encpayload][retaddr]
  • Modern obfuscation technique
    • Automatically re-write code
      • Hard(NP-complete) to decompose to graph
    • Code-obfuscation
      • Encrypt
      • Dynamically decrypt at runtime
      • Remain code obfuscated at transmission

Signature Matching

  • String-based signatures
    • Snort
  • Detection heuristics
    • Frequency of various packet types
  • Identification of NOP sled
  • Signature from the actual actual exploit code
  • Statistical measures of packet content

Decoder and Decoder Detector

  • Look for decoder rather than payload
  • How well the decoder can be hidden
    • Rearranging and randomizing the order of the individual ciphers components
    • Randomly chosen keys
    • Insert junk instructions
  • Decoder
    • Components
      • Modification operation
        • add, sub, xor, etc.
      • Loop component
        • jmpz
    • Maintenance behaviors, more than "decoding"
      • Clear register
      • Multiple cipher operation
      • Calculate the location of the executable

Goals and Contributions

  • Describe shellcode and code obfuscation techniques
  • Measure the strengths of polymorphic engines
  • Introduce a hybrid engine
  • Proved: Given any normal statistical model, there is a significant probability that an attacker can craft successful targeted attacks against it.

Polymorphic Engine Analysis

Notation

  • nn, #\# string bytes
  • NN, #\# samples
  • x\mathbf{x} (y\mathbf{y}), set of column vectors (samples)
  • xi\mathbf{x}_i (xj\mathbf{x}_j), i,j=1...Ni, j = 1 ... N, the ithi^{th} (jthj^{th}) vector (sample) in the set x\mathbf{x}
  • x(i)\mathbf{x}(i), the ithi^{th} component of the vector x\mathbf{x}

Problem Definition

  • Given nn bytes, there exist 256n256^n possible strings
  • x86 code of length nn is a subspace
  • How difficult is it to model this subspace?

Measures

Spectral Image

visualization of shellcode variations

  • DD decoders of length NN
  • Compile into D×ND \times N matrix
  • Display matrix as image
    • 0x00-0xFF: black-white

Minimum Euclidian distance

  • Intuition: Decoders can shift order of operations
  • String x\mathbf{x} as point in nn-dimensional Euclidian space
    • Example (2D): "ab" (97,98)\rightarrow (97, 98)
  • Minimum Euclidian Distance:
    • Minimum normalized distance between two points under arbitrary byte-level rotations
    • rot(y,r)rot(\mathbf{y}, r), rotate the string y\mathbf{y} to the left by rr-bytes, with wraparound

δ(x,y)=min1rn{x=rot(y,r)x+y}\delta(\mathbf{x}, \mathbf{y}) = \min_{1 \leq r \leq n}\{\frac{\parallel \mathbf{x} = rot(\mathbf{y}, r) \parallel}{\parallel \mathbf{x} \parallel + \parallel \mathbf{y} \parallel}\}

Variation strength

  • Magnitude of the space covered by span of points in nn-space corresponding to detectors
  • Decoders x1,x2,...,xNx_1, x_2, ..., x_N in nn-space
  • λ1,λ2,...,λn\lambda_1, \lambda_2, ..., \lambda_n, eigenvalues of covariance matrix
  • Variation strength:

Ψ(engine)=1di=1dλi\Psi(\text{engine}) = \frac{1}{d}\sum_{i = 1}^{d}{\sqrt{\lambda_i}}

Propagation strength

  • Worst case of variation strength measure
    • Decoder is distributed on a hollow nn-dimensional sphere with a large radius
  • Efficacy in making sample pairs different
  • Consider fully connected graph with decoders as nodes
    • Edge weight == minimum Euclidian distance
    • Average edge weight == propagation strength
  • η=#\eta = \# salient bytes in samples
  • δ(x,y)\delta(\mathbf{x}, \mathbf{y}), distance between two samples
    • Flexible selection of distance function δ\delta
  • Default: uniform prior, p(δ())=1p(\delta(\cdot)) = 1
  • Propagation strength:

Φ(engine)=(1ηn)p(δ(x,y))δ(x,y)dxdy\Phi(engine) = (1 - \frac{\eta}{n})\int\int p(\delta(\mathbf{x}, \mathbf{y}))\delta(\mathbf{x}, \mathbf{y})d\mathbf{x} d\mathbf{y}

Overall strength

  • Measure polymorphic engine:

Π(engine)=Ψ(engine)Φ(engine)\Pi(\text{engine}) = \Psi(\text{engine}) \cdot \Phi(\text{engine})

Hybrid Engine: Combining Polymorphism and Blending

  • CLET: byte distribution blending
  • ADMutate: Polymorphism
    • Random looking decoder, recursive NOP sled
  • Combine CLET and ADMutate
    • Blend in with normal traffic
    • Blending bytes can be randomly permuted
    • RETADDR can be added with a random offset
    • 4-byte salient artifact too small to use as a signature
    • Essentially impossible to model

References

  • On the Infeasibility of Modeling Polymorphic Shellcode, Song et al, 2007
  • On the Infeasibility of Modeling Polymorphic Shellcode Re-thinking the role of learning in intrusion detection systems, Song et al, 2009
  • CS 259D Session 14

results matching ""

    No results matching ""