On the Infeasibility of Modeling Polymorphic Shellcode
Background Knowledge and Insights
Shellcode
- Traditional shellcode structure
- Encrypted shellcode structure
- [nop][decoder][encpayload][retaddr]
- Modern obfuscation technique
- Automatically re-write code
- Hard(NP-complete) to decompose to graph
- Code-obfuscation
- Encrypt
- Dynamically decrypt at runtime
- Remain code obfuscated at transmission
Signature Matching
- String-based signatures
- Detection heuristics
- Frequency of various packet types
- Identification of NOP sled
- Signature from the actual actual exploit code
- Statistical measures of packet content
Decoder and Decoder Detector
- Look for decoder rather than payload
- How well the decoder can be hidden
- Rearranging and randomizing the order of the individual ciphers components
- Randomly chosen keys
- Insert junk instructions
- Decoder
- Components
- Modification operation
- Loop component
- Maintenance behaviors, more than "decoding"
- Clear register
- Multiple cipher operation
- Calculate the location of the executable
Goals and Contributions
- Describe shellcode and code obfuscation techniques
- Measure the strengths of polymorphic engines
- Introduce a hybrid engine
- Proved: Given any normal statistical model, there is a significant probability that an attacker can craft successful targeted attacks against it.
Polymorphic Engine Analysis
Notation
- n, # string bytes
- N, # samples
- x (y), set of column vectors (samples)
- xi (xj), i,j=1...N, the ith (jth) vector (sample) in the set x
- x(i), the ith component of the vector x
Problem Definition
- Given n bytes, there exist 256n possible strings
- x86 code of length n is a subspace
- How difficult is it to model this subspace?
Measures
Spectral Image
- D decoders of length N
- Compile into D×N matrix
- Display matrix as image
Minimum Euclidian distance
- Intuition: Decoders can shift order of operations
- String x as point in n-dimensional Euclidian space
- Example (2D): "ab" →(97,98)
- Minimum Euclidian Distance:
- Minimum normalized distance between two points under arbitrary byte-level rotations
- rot(y,r), rotate the string y to the left by r-bytes, with wraparound
δ(x,y)=min1≤r≤n{∥x∥+∥y∥∥x=rot(y,r)∥}
Variation strength
- Magnitude of the space covered by span of points in n-space corresponding to detectors
- Decoders x1,x2,...,xN in n-space
- λ1,λ2,...,λn, eigenvalues of covariance matrix
- Variation strength:
Ψ(engine)=d1∑i=1d√λi
Propagation strength
- Worst case of variation strength measure
- Decoder is distributed on a hollow n-dimensional sphere with a large radius
- Efficacy in making sample pairs different
- Consider fully connected graph with decoders as nodes
- Edge weight = minimum Euclidian distance
- Average edge weight = propagation strength
- η=# salient bytes in samples
- δ(x,y), distance between two samples
- Flexible selection of distance function δ
- Default: uniform prior, p(δ(⋅))=1
- Propagation strength:
Φ(engine)=(1−nη)∫∫p(δ(x,y))δ(x,y)dxdy
Overall strength
- Measure polymorphic engine:
Π(engine)=Ψ(engine)⋅Φ(engine)
Hybrid Engine: Combining Polymorphism and Blending
- CLET: byte distribution blending
- ADMutate: Polymorphism
- Random looking decoder, recursive NOP sled
- Combine CLET and ADMutate
- Blend in with normal traffic
- Blending bytes can be randomly permuted
- RETADDR can be added with a random offset
- 4-byte salient artifact too small to use as a signature
- Essentially impossible to model
References
- On the Infeasibility of Modeling Polymorphic Shellcode, Song et al, 2007
- On the Infeasibility of Modeling Polymorphic Shellcode Re-thinking the role of learning in intrusion detection systems, Song et al, 2009
- CS 259D Session 14