A Multi-Model Approach to the Detection of Web-Based Attacks

Background Knowledge and Insights

Assumptions of Anomaly Detection

  • Rely on models of the intended behavior of users and applications
  • Assume attack patterns differ from normal behavior
  • The difference can be expressed either quantitatively or qualitatively
  • Interpret deviations from this "normal" behavior as evidence of malicious activity

Web Security

  • Web servers are accessible to the outside world
  • Web apps are often developed with security as an afterthought
  • Misuse-based detection
    • Example: Snort
      • 1037 out of its 2464 signatures target web-related attacks
    • Hard to keep up-to-date
      • Time-intensive, error-prone, requires significant security expertise
    • Challenge with apps developed in-house
    • Unable to detect unmodeled (previously unseen) attacks
  • Anomaly-based
    • Applicable to custom-developed web apps
    • Support detection of new attacks

Goals and Contributions

  • Unsupervised, learning-based anomaly detection
  • Deployed on host
  • A number of different models
    • Reduce vulnerability to mimicry attacks
  • Target specific types of applications
    • More focused analysis

Sample Data

Data set    Time interval   Size (MByte)   HTTP queries   Programs   Program requests   Attributes
Google      1 h             236            640,506        3          490,704            1,611,254
UCSB        297 days        1001           9,951,174      2          4,617              7,993
TU Vienna   80 days         251            2,061,396      8          713,500            765,399

Data model

Sample web server access log entry
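For reference, a Common Log Format entry might look like the following (host, timestamp, path, and parameters are hypothetical):

```
192.168.0.10 - - [11/Nov/2003:14:21:07 -0800] "GET /search.php?term=anomaly&page=2 HTTP/1.1" 200 5121
```

Here the query string term=anomaly&page=2 carries the attribute-value pairs $(a_1, v_1) = (term, anomaly)$ and $(a_2, v_2) = (page, 2)$.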

  • An ordered set $U = \{u_1, u_2, ..., u_m\}$ of URIs
    • Extracted from successful GET requests
    • $200 \leq \text{return-code} < 300$
  • Components of $u_i$
    • Path to the desired resource: $path_i$
    • Optional path information: $pinfo_i$
    • Optional query string: $q$
      • Follows a ? character
      • Passes parameters to the referenced resource
      • Attributes and values: $q = (a_1, v_1), (a_2, v_2), ..., (a_n, v_n)$
      • $a_i \in A$, where $A$ is the set of all attributes
      • $v_i$: a string
      • $S_q = \{a_1, a_2, ..., a_n\}$: the set of attributes of query $q$
  • URIs without query strings are not included in $U$
  • $U_r$: subset of $U$ with resource path $r$
    • Partitions $U$
    • Anomaly detection runs independently on each $U_r$

Detection models

  • One query, one model
  • Alerting whenever even a single attribute is anomalous is a deliberately cautious policy
  • Training mode
    • Create profiles for each server-side program and each of its attributes
    • Establish suitable thresholds
      • Store the highest anomaly score
      • Default threshold: $10\%$ larger than the max anomaly score observed in training mode
  • Detection mode
    • Task: each model returns a probability $p$ of normalcy
    • Anomaly score: $\sum_{m \in \text{Models}} w_m (1 - p_m)$
      • Each model has an associated weight $w_m$
        • Default value $= 1$ in this paper
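A minimal sketch of the score combination, assuming each model $m$ has already returned a normalcy probability $p_m$ (function and variable names are illustrative, not from the paper):

```python
def anomaly_score(model_probs, weights=None):
    """Weighted anomaly score: sum over models of w_m * (1 - p_m)."""
    weights = weights or {}
    return sum(weights.get(m, 1.0) * (1.0 - p)  # default weight w_m = 1
               for m, p in model_probs.items())

def is_anomalous(model_probs, max_training_score):
    # Default threshold: 10% above the highest score seen during training.
    return anomaly_score(model_probs) > 1.10 * max_training_score
```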

Attribute length

  • Length distribution does not follow a smooth curve
  • Distribution has a large variance

User

  • Fixed size tokens
    • Session identifiers
  • Short input strings
    • Fields in an HTML form

Attacker

  • Buffer overflow: shell code and padding
    • Several hundred bytes
  • XSS

Learning

  • Estimate the mean $\mu$ and variance $\sigma^2$ of attribute lengths in the training data

Detection

  • Only strings longer than the mean are penalized
    • If length $l < \mu$: $p = 1$
    • Padding only makes an attack longer, so it is not an effective evasion
  • Chebyshev inequality: $p(|x - \mu| > |l - \mu|) < \frac{\sigma^2}{(l - \mu)^2}$
    • A deliberately weak bound; useful to flag only significant outliers
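A sketch of the resulting check, reading the Chebyshev bound above as the normalcy probability (names are illustrative):

```python
def length_prob(length, mu, var):
    """Normalcy probability for an attribute value of the given length."""
    if length <= mu:
        return 1.0  # shorter-than-average strings are never penalized
    # Chebyshev bound on a deviation at least this large; deliberately
    # weak, so only significant outliers receive a low probability.
    return min(1.0, var / ((length - mu) ** 2))
```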

Attribute character distribution

  • Character distribution: sorted relative character frequencies of an attribute value
    • The link between individual characters and their frequencies is lost
      • passwd: 0.33, 0.17, 0.17, 0.17, 0.17, 0, ..., 0
      • The same distribution as aabcde
    • Falls smoothly for human-readable tokens
    • Falls quickly for malicious input

User

  • Observations about attributes
    • Regular structure
    • Mostly human readable
    • Almost always contain only printable characters
  • Frequency distribution of query parameters
    • Similar, but not identical, to that of standard English text

Attacker

  • Repeated padding characters cause the frequencies to drop extremely fast
  • Examples
    • Buffer overflow: needs to send binary data and padding
    • Directory traversal exploit: many . characters in the attribute value

Learning

  • Character distribution of each observed attribute is stored
  • Average of all character distributions computed

Detection

  • Variant of the Pearson $\chi^2$-test
    • Goodness-of-fit
  • Bins over the sorted distribution: $\{[0], [1, 3], [4, 6], [7, 11], [12, 15], [16, 255]\}$
  • For each query attribute:
    • Compute its character distribution
    • Observed frequencies $O_i$: aggregated over the bins
    • Expected frequencies $E_i$: learned character distribution $\times$ attribute length
  • Compute $\chi^2 = \sum_{i=0}^{i<6} \frac{(O_i - E_i)^2}{E_i}$
  • Read the corresponding probability from a $\chi^2$ table
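A sketch of the test under these definitions, assuming the learned profile is the averaged sorted character distribution from training (256 relative frequencies in descending order); the bin boundaries are those given above:

```python
# Rank bins over the sorted character distribution (boundaries from the paper).
BINS = [(0, 0), (1, 3), (4, 6), (7, 11), (12, 15), (16, 255)]

def sorted_char_dist(value: bytes):
    """Relative byte frequencies of value, sorted in descending order."""
    counts = [0] * 256
    for b in value:
        counts[b] += 1
    n = max(len(value), 1)
    return sorted((c / n for c in counts), reverse=True)

def chi_square(value: bytes, learned_dist):
    """Chi-square statistic against the averaged training distribution."""
    n = len(value)
    observed = sorted_char_dist(value)
    chi2 = 0.0
    for lo, hi in BINS:
        o = sum(observed[lo:hi + 1]) * n      # observed count in this bin
        e = sum(learned_dist[lo:hi + 1]) * n  # expected count: profile * length
        if e > 0:
            chi2 += (o - e) ** 2 / e
    return chi2
```

The statistic is then mapped to a probability with a precomputed $\chi^2$ table (six bins, so five degrees of freedom).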

Structural Inference

User

  • Parameter structure
    • Regular grammar describing all of its legitimate values

Attacker

  • Exploits requiring different parameter structure
    • Examples: Buffer overflow, directory traversal, XSS
  • Simple manifestations of an exploit
    • Unusually long parameters
    • Parameters containing repetitions of non-printable characters
  • Evasion
    • Replace non-printable characters by groups of printable characters

Learning

  • Generalization should seem "reasonable"
  • Stop generalizing before too much structural information is lost
  • Goal: find the model (NFA) with the highest posterior probability given the training examples
  • Markov model / non-deterministic finite automaton (NFA)
    • "Reasonable generalization"
    • $p_S(o)$: probability of emitting symbol $o$ in state $S$
    • $p(t)$: probability of taking transition $t$
    • Output: paths from the Start state to the Terminal state
  • Bayesian model induction
    • $p(\text{model} \mid \text{training data}) = \frac{p(\text{training data} \mid \text{model}) \cdot p(\text{model})}{p(\text{training data})}$
    • $p(\text{training data})$: a scaling factor; ignored
    • $p(\text{model})$: encodes a preference towards smaller models
      • Total number of states: $N$
      • Total number of transitions at each state $S$: $T(S)$
      • Total number of emissions at each state $S$: $E(S)$
    • Start with a model exactly reflecting the input data
    • Gradually merge states
    • Until the posterior probability does not increase
    • Cost: $O((nL)^3)$, with $n$ training input strings and $L$ the maximum length of each string
    • Up to $nL$ states
    • $\frac{nL(nL-1)}{2}$ state-pair comparisons for each merging step
    • Up to $n(L-1)$ merges
  • Optimizations
    • Viterbi path approximations
    • Path prefix compression
    • Cost: $O(nL^2)$

Detection

  • First option: compute the probability of the query attribute under the model
    • Issue: the probabilities of all input words sum up to $1$
    • So all words have very small probabilities
  • Output instead:
    • $p = 1$ if the word is a valid output of the Markov model
    • $p = 0$ otherwise
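Since the output is binary, detection reduces to a membership test against the inferred automaton. A minimal sketch, encoding the NFA as a dict of state → {symbol: successor states} (the encoding is an illustrative assumption, not the paper's data structure):

```python
def accepts(transitions, start, accepting, word):
    """True iff the automaton can generate `word` (no epsilon transitions)."""
    current = {start}
    for symbol in word:
        current = {nxt
                   for state in current
                   for nxt in transitions.get(state, {}).get(symbol, ())}
        if not current:
            return False  # no reachable state can emit this symbol
    return bool(current & accepting)

def structural_prob(transitions, start, accepting, value):
    return 1.0 if accepts(transitions, start, accepting, value) else 0.0
```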

Token finder

  • Goal: determine whether values of an attribute are drawn from an enumerated set of tokens

User

  • Web applications often require one out of a few possible values for certain query attributes
    • Example: flags, indices

Attacker

  • Use these attributes to pass illegal values

Learning

  • Decide whether an argument's values are enumerated or random
    • Random: the number of different argument instances grows proportionally with the total number of argument instances
    • Enumeration: no such growth, since values are drawn from a finite set
  • Compute the correlation $\rho$ between $f$ and $g$
    • $f(x) = x$
    • $g(x)$: an "enumeration counter"
      • $g(x) = g(x-1) + 1$ if the $x$-th value is new
      • $g(x) = g(x-1) - 1$ if the $x$-th value was seen before
      • $g(0) = 0$
    • $\rho = \frac{\mathrm{Cov}(f, g)}{\sqrt{\mathrm{Var}(f)\,\mathrm{Var}(g)}}$
      • If $\rho < 0$, then enumeration
      • If enumeration, store all observed values for use in the detection phase
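A sketch of the learning step, using statistics.correlation (Python 3.10+) for the Pearson coefficient; names are illustrative:

```python
from statistics import correlation

def learn_token_finder(values):
    """Return (is_enumeration, observed_values) for one attribute."""
    seen, g, g_series = set(), 0, []
    for v in values:
        g = g + 1 if v not in seen else g - 1  # the "enumeration counter"
        seen.add(v)
        g_series.append(g)
    f_series = list(range(1, len(values) + 1))  # f(x) = x
    rho = correlation(f_series, g_series)       # negative -> enumeration
    return rho < 0, seen
```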

Detection

  • If enumeration: the value is expected to be among the stored values
    • Output $p = 1$ or $p = 0$ correspondingly
    • Hash table lookup for efficiency
  • If random: $p = 1$

Attribute presence or absence

User

  • URIs are typically produced not directly by users, but by scripts, forms, or client-side programs
    • Regularity in the number, names, and order of parameters

Attacker

  • Hand-crafted attacks typically break this regularity
    • Incomplete or malformed requests to probe/exploit web app
      • Missing argument
      • Mutually exclusive arguments appearing together

Learning

  • Create a model of acceptable subsets of attributes
  • Record each distinct set $S_q$
    • Each query qq during training
    • Hash table

Detection

  • Look up the attribute set in the hash table
    • Return $p = 1$ or $p = 0$ correspondingly
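A minimal sketch of this model: both phases are hash-set operations over attribute-name sets (class and method names are illustrative):

```python
class AttributePresenceModel:
    def __init__(self):
        self.known = set()

    def train(self, attribute_names):
        self.known.add(frozenset(attribute_names))  # record each distinct S_q

    def prob(self, attribute_names):
        # Constant-time lookup; unseen attribute combinations score 0.
        return 1.0 if frozenset(attribute_names) in self.known else 0.0
```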

Attribute order

User

  • Same parameters in the same order
  • Sequential program logic preserves order even when some attributes left out

Attacker

  • Arbitrary
  • No influence on the execution of the program

Learning

  • Attribute $a_s$ precedes $a_t$ if:
    • $a_s$ and $a_t$ appear together in at least one query, and
    • $a_s$ comes before $a_t$ whenever they appear together
  • Directed graph $G$
  • Vertex $v_i$ corresponds to attribute $a_i$
    • Number of vertices = number of attributes
  • For each training query, add edges between the nodes of ordered attribute pairs
    • Directed edges correspond to observed orders of attribute pairs
  • Find all strongly connected components (SCCs) of the graph
  • Remove edges between nodes in the same SCC
    • Discards "orders" for groups of attributes observed in conflicting orders
  • For each node, find all reachable nodes
  • Add the corresponding pairs $(a_s, a_t)$ to the set of precedence orders $O$
  • Tarjan's algorithm: $O(v + e)$

Detection

  • Find all order violations
    • Return $p = 0$ or $p = 1$ correspondingly
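A sketch of both phases, using networkx for SCCs and reachability as a convenience in place of a hand-rolled Tarjan's algorithm:

```python
import networkx as nx

def learn_order(training_queries):
    """training_queries: iterable of attribute-name lists, in observed order."""
    g = nx.DiGraph()
    for attrs in training_queries:
        for i, a in enumerate(attrs):
            for b in attrs[i + 1:]:
                g.add_edge(a, b)
    # Attributes inside one SCC were seen in conflicting orders,
    # so edges among them carry no precedence constraint.
    for scc in list(nx.strongly_connected_components(g)):
        for a in scc:
            for b in scc:
                if g.has_edge(a, b):
                    g.remove_edge(a, b)
    # O: all pairs (a_s, a_t) such that a_t is reachable from a_s.
    return {(a, b) for a in g for b in nx.descendants(g, a)}

def order_prob(order, attrs):
    """p = 0 on any violation of a learned precedence, else p = 1."""
    for i, a in enumerate(attrs):
        for b in attrs[i + 1:]:
            if (b, a) in order:
                return 0.0
    return 1.0
```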

Access frequency

  • Frequency patterns of different server-side web applications
  • Two types of frequencies
    • Frequency of application being accessed from a certain client
      • Based on the client's IP address
    • Total frequency of all accesses

User

  • Example 1: authentication script
    • Accessed very infrequently by each individual client
  • Example 2: search script
    • High frequency for a particular client
    • Low total frequency

Attacker

  • Suddenly increase access frequency
  • Probing
  • Guess parameter values
  • Evasion: slow down

Learning

  • Divide training time into intervals of fixed length (e.g., 10 sec)
  • Count accesses in each interval
  • Find total and client-specific distributions

Detection

  • Compute the Chebyshev probability for the total frequency and for the client-specific frequency
  • Return the average of the two probabilities
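A sketch of the detection step, reusing a Chebyshev bound for both distributions; the (mu, var) pairs are assumed to come from the training intervals, and penalizing only above-average frequencies is an assumption of this sketch:

```python
def chebyshev_prob(x, mu, var):
    if x <= mu:
        return 1.0  # only unusually high frequencies are penalized here
    return min(1.0, var / ((x - mu) ** 2))

def access_frequency_prob(total_count, client_count, total_stats, client_stats):
    """Average of the total and per-client Chebyshev probabilities."""
    p_total = chebyshev_prob(total_count, *total_stats)
    p_client = chebyshev_prob(client_count, *client_stats)
    return (p_total + p_client) / 2.0
```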

Inter-request time delay

User

  • Wide variance

Attacker

  • Regular delay between each successive request
    • Surveillance
    • Scripted probe

Learning

  • Find distribution of normal delays
    • Similar to character distribution model

Detection

  • Pearson $\chi^2$-test

Invocation order

  • Order of invocation of web-based applications for each client
    • Infer regularity in session structure
    • Similar to the structural inference model, but applied to
      • the sequence of queries, not
      • the syntax of a single query's parameters

User

  • Invocation order can be generated by a certain Markov model

Attacker

  • Detected when the attack targets the application logic
  • The attack's invocation sequence cannot be output by that model

Learning

  • Group queries based on source IP
    • Session: Queries within an interval of time
    • Build NFA for sessions
  • Independent of server-side applications

Detection

  • $p = 1$ or $p = 0$ depending on whether the session is an output of the NFA

Limitation

  • On the Google data set, nearly half of the false positives were anomalous search strings
    • Search strings containing non-printable characters
      • Probably requests issued by users with incompatible character sets
    • Extremely long strings
      • Such as URLs pasted directly into the search field

References

  • Kruegel, Vigna, Robertson. A multi-model approach to the detection of web-based attacks. Computer Networks, 2005
  • CS 259D Lecture 8
