Active Automata Learning: From DFA to Interface Programs and Beyond
or: From Languages to Program Executions
or (more technically): The Power of Counterexample Analysis
Bernhard Steffen, Falk Howar, Malte Isberner, TU Dortmund / CMU
B. Steffen, Summer School CPS 2014
Connect Scenario (figure): a connector, the CONNECT learner, and some service in the environment. Arrow labels: look for known models; interrogate; learn X; inform about new service and device; try to use.
Data-Dependent Control: Value-independent Data Dependencies
How to Extend with Data?
- Data is crucial for modeling
- Interface specifications relate data in input to data in subsequent output
- Communication protocols: sequence numbers, identifiers, ...
- (External) Mapper-Based Data Treatment
- Explicit Data Modelling
Outline: Background; Manual Treatment of Data; Automated Alphabet Abstraction Refinement; Modelling Data Explicitly; Conclusions
Computer/Telephony Integrated Systems (figure). Labels: LAN, Switch, Model-Generator, Application PCs, ISDN Network, Application Server.
The Concrete Scenario (figure). Components: Test Coordinator, Rational Robot, Hipermon instances, Application PCs, Application Server. Interface labels: HTTP, CSTA II/III, PCM.
Means of Observation (figure). Components: Test Coordinator, Rational Robot, Hipermon instances, Application PCs, Application Server. Interface labels: HTTP, CSTA II/III, PCM.
(Small) learned models led to major test suite optimizations.
Moderated, Regular Extrapolation
- Extrapolation: hypothesis building beyond known facts
- Regular: the extrapolation universe is extended finite automata
- Moderated: the extrapolation process requires targeted interaction
Neither correct nor complete!
Models in our Scenario: abstract representation of the protocol-level behaviour.
Abstraction typically concerns: symbolic names, details, no time stamps, etc.
Concrete event: { invokeid = 58391, operation-value = 21 (cstaeventreport), {eventspecificinfo ..... hookswitch {deviceid.dialingnumber = 500 hookswitchonhook = TRUE, ... timestamp = 20001010095551 } }}}
Abstract event: {obsevent deviceid = A1 switchonhook, ... }}
Sketch of the Model Structure: models comprise state changes as well as UPN- and CSTA-Observations.
(Figure: Sys_Info nodes linked by obs_csta and upnoffhook observations. Example observation: { {deviceid = A1 hookswitchonhook, ... }}. Example Sys_Info for device A1: display(line 1, ...), LEDs: (1,on) (2,off) ...)
Active Automata Learning (figure): observation table (OT) with reaching words, transitions (lower part), and distinguishing futures; hypothesis automaton; unknown system; closedness & consistency; validation.
Membership Queries (answered by the unknown system). The OT rows are the abstract states; the lower part gives the transition relation.
OT      ε
ε       1
--------
a       1
b       0
Not closed!
Closure & Consistency
OT      ε
ε       1
b       0
--------
a       1
ba      0
bb      0
Closed & Consistent. Hypothesis (figure): two states; edge labels a, b, and a,b.
Equivalence Queries
OT      ε
ε       1
b       0
--------
a       1
ba      0
bb      0
Hypothesis (figure): two states; edge labels a, b, and a,b. The unknown system yields 1 for ab.
Counterexample: ab ∈ L
Counterexample-Based Extension
OT      ε
ε       1
b       0
a       1
ab      1
--------
ba      0
bb      0
aa      0
aba     0
abb     1
Counterexample: ab ∈ L
Closure & Consistency
OT      ε
ε       1
b       0
a       1
ab      1
--------
ba      0
bb      0
aa      0
aba     0
abb     1
Not consistent: row(ε) = row(a), but row(ε·a) ≠ row(a·a).
New column: a
Next Iteration
OT      ε   a
ε       1   1
b       0   0
a       1   0
ab      1   0
--------
ba      0   0
bb      0   0
aa      0   0
aba     0   0
abb     1   0
Closed & Consistent
Next Iteration
OT      ε   a
ε       1   1
b       0   0
a       1   0
ab      1   0
--------
ba      0   0
bb      0   0
aa      0   0
aba     0   0
abb     1   0
Hypothesis (figure): three states; edge labels a, b, and a,b.
Finished!
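Active automata learning: L* (figure), Σ = {a,b}. The learner asks the MQ-Oracle membership queries (e.g. "aba in L?", answer: "no") and the EQ-Oracle equivalence queries about its hypothesis automaton (answer: "no", together with the counterexample bb).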
Summary of L* algorithm
L* infers a finite state machine from queries:
1. Pose membership queries until saturation
2. Construct hypothesis from obtained information
3. Pose equivalence query
4. If the answer is no: look at the counterexample and go to 1
5. Else: return hypothesis
- Has been used to learn large automata (on the order of 100k states)
- Adapted for Mealy machines [Niese et al. 2003] and for interface automata [Aarts et al. 2010]
- Efficient tool: LearnLib [TU Dortmund]
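The five steps can be sketched in a few lines of Python. This is a minimal illustration, not the LearnLib implementation: the target language in `member`, the names `lstar` and `accepts`, and the bounded exhaustive test that stands in for the equivalence query (`max_len`) are all assumptions made for this example.

```python
from itertools import product

SIGMA = ("a", "b")

def member(word):
    """Membership oracle for a hypothetical target: the empty word, or 'a' followed by any number of 'b'."""
    return word == "" or (word[0] == "a" and set(word[1:]) <= {"b"})

def accepts(hyp, word):
    """Run a hypothesis (initial state, transition map, accepting set) on a word."""
    state, delta, accepting = hyp
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

def lstar(member, sigma, max_len=6):
    S, E, T = [""], [""], {}  # reaching words, distinguishing futures, table entries

    def row(u):
        return tuple(T[u + e] for e in E)

    while True:
        # Step 1: membership queries until the table is filled, closed and consistent.
        for u in S + [s + a for s in S for a in sigma]:
            for e in E:
                if u + e not in T:
                    T[u + e] = member(u + e)
        upper = {row(s) for s in S}
        unclosed = next((s + a for s in S for a in sigma if row(s + a) not in upper), None)
        if unclosed is not None:
            S.append(unclosed)          # not closed: promote the new row
            continue
        new_future = next((a + e for s1 in S for s2 in S if row(s1) == row(s2)
                           for a in sigma for e in E
                           if T[s1 + a + e] != T[s2 + a + e]), None)
        if new_future is not None:
            E.append(new_future)        # not consistent: add a new column
            continue
        # Step 2: construct the hypothesis; states are the distinct rows of S.
        rep = {}
        for s in S:
            rep.setdefault(row(s), s)
        delta = {(r, a): row(rep[r] + a) for r in rep for a in sigma}
        hyp = (row(""), delta, {r for r in rep if r[0]})  # r[0] is the column for the empty future
        # Step 3: equivalence query, approximated here by testing all words up to max_len.
        words = ("".join(p) for n in range(max_len + 1) for p in product(sigma, repeat=n))
        ce = next((w for w in words if accepts(hyp, w) != member(w)), None)
        if ce is None:
            return hyp                  # Step 5: no counterexample found
        # Step 4: add all prefixes of the counterexample as reaching words, then go to 1.
        S += [ce[:i] for i in range(1, len(ce) + 1) if ce[:i] not in S]
```

Calling `lstar(member, SIGMA)` returns a triple that `accepts` can run on words. A real equivalence oracle is usually the expensive part; the bounded test here only gives confidence up to the chosen word length.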
Analysis of Counterexamples I: one essential suffix
(Table residue: an observation table containing all prefixes of the counterexample, with one suffix marked as the essential suffix. Labels: a, b, bb. Recoverable rows: ε 0 0; a 1 1; b 1 1; bb 0 0; bbb 0 0; aa 1 1; ab 1 1; ba 0 0.)
Effect: Reduced Observation Table
Rivest and Schapire:
- Analyze the counterexample separately (not in the table)
- Only add one essential suffix (i.e., witness) as column label to the table
Consequence:
- Guaranteed consistency!
- Improved worst case complexity
- BUT: hypothesis automata are no longer guaranteed to be minimal! (cf. Pnueli / Maler's criticism)
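A sketch of how one essential suffix can be located by binary search over the counterexample, in the spirit of Rivest and Schapire's analysis. The two-state hypothesis in `DELTA` (states named by their access sequences), the target language in `member`, and the function names are assumptions chosen for illustration.

```python
def member(word):
    """Hypothetical target language: the empty word, or 'a' followed by any number of 'b'."""
    return word == "" or (word[0] == "a" and set(word[1:]) <= {"b"})

# Hypothetical two-state hypothesis; each state is named by its access sequence.
DELTA = {("", "a"): "", ("", "b"): "b", ("b", "a"): "b", ("b", "b"): "b"}

def access_after(prefix):
    """Access sequence of the hypothesis state reached by reading prefix."""
    state = ""
    for symbol in prefix:
        state = DELTA[(state, symbol)]
    return state

def essential_suffix(ce, member):
    """Return a suffix of ce that separates two words the hypothesis identifies."""
    def answer(i):
        # Replace the first i symbols by the access sequence of the state they reach.
        return member(access_after(ce[:i]) + ce[i:])

    lo, hi = 0, len(ce)
    if answer(lo) == answer(hi):
        raise ValueError("not a counterexample for this hypothesis")
    # Invariant: answer(lo) != answer(hi); shrink until the flip is between lo and lo + 1.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if answer(mid) == answer(lo):
            lo = mid
        else:
            hi = mid
    return ce[hi:]
```

For the counterexample `"ab"` this returns `"b"`, and for `"aa"` it returns `"a"`. Only logarithmically many membership queries in the length of the counterexample are needed, and only the returned suffix is added as a column.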
Outline: Background; Manual Treatment of Data; Automated Alphabet Abstraction Refinement; Modelling Data Explicitly; Conclusions
Simple Stack: finite capacity
Mappers
Learning the stack as a language
- Abstract inputs: push, pop; abstract outputs: ∈ L, ∉ L
- Concrete calls: stack.push(1), stack.pop(); concrete outputs: true, false, null, 1
Introducing outputs: Mealy machines
- Abstract inputs: push, pop; abstract outputs: OK, NOK, null, 1
- Concrete calls: stack.push(1), stack.pop(); concrete outputs: true, false, null, 1
Introducing outputs: Mealy machines
- Abstract inputs: push1, push2, pop; abstract outputs: OK, NOK, null, 1, 2
- Concrete calls: stack.push(1), stack.push(2), stack.pop(); concrete outputs: true, false, null, 1, 2
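A hand-written mapper of this kind might look as follows. This is a sketch under assumptions: the `BoundedStack` class, its capacity of 2, and the translation of true/false to OK/NOK are illustrative choices, not taken from the slides.

```python
class BoundedStack:
    """Hypothetical system under learning: a stack with finite capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, value):
        if len(self.items) >= self.capacity:
            return False            # stack is full, value rejected
        self.items.append(value)
        return True

    def pop(self):
        return self.items.pop() if self.items else None


class StackMapper:
    """Translates abstract symbols (push1, push2, pop) to concrete calls and back."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.reset()

    def reset(self):
        self.stack = BoundedStack(self.capacity)

    def step(self, abstract_input):
        if abstract_input.startswith("push"):
            value = int(abstract_input[len("push"):])   # push1 -> stack.push(1)
            return "OK" if self.stack.push(value) else "NOK"
        if abstract_input == "pop":
            value = self.stack.pop()
            return "null" if value is None else str(value)
        raise ValueError(f"unknown abstract symbol: {abstract_input}")

    def run(self, abstract_word):
        """Answer one output query: reset the system, then feed the word."""
        self.reset()
        return [self.step(symbol) for symbol in abstract_word]
```

A learner for Mealy machines only ever sees the abstract side, for example `StackMapper().run(["push1", "push2", "push1", "pop", "pop", "pop"])`. The price of this manual approach is that every data value of interest needs its own abstract symbol.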
Outline: Background; Manual Treatment of Data; Automated Alphabet Abstraction Refinement; Modelling Data Explicitly; Conclusions
Automated Alphabet Abstraction Refinement: learning setup in practice (figure). Labels: LearnLib; abstract symbols Available and OK; test driver with static alphabet abstraction; concrete messages <presence type= /> and <iq type= result />.
Automated Alphabet Abstraction Refinement: learning relative to a given representation system (figure).
- Static alphabet abstraction: LearnLib, test driver, abstract symbols Available and OK, concrete messages <presence type= /> and <iq type= result />
- The abstract symbol Available can stand for Available(type=avail ) or Available(type=unavail ): non-determinism during the EQ test
- CEGAR teacher: the test driver uses Available(type=avail )
The Mod-k Stack: finite set of outputs, e.g. odd / even
- Abstract inputs: push, push, pop; abstract outputs: OK, NOK, null, odd, even
- Concrete calls: stack.push(51); stack.push(2012); stack.pop(); concrete outputs: true, false, null, 51, 2012
- push push pop / odd versus push push pop / even
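The two traces "push push pop / odd" and "push push pop / even" show why a too coarse input abstraction looks non-deterministic to the learner. A small sketch, assuming an unbounded stack and k = 2 for brevity; the function names and the tuple encoding of concrete symbols are hypothetical.

```python
from collections import defaultdict

def abstract_run(concrete_word):
    """Run a word on a mod-2 stack; return its abstract inputs and abstract outputs."""
    stack, abstract_inputs, abstract_outputs = [], [], []
    for name, argument in concrete_word:
        abstract_inputs.append(name)            # push(51) and push(2012) both become "push"
        if name == "push":
            stack.append(argument)
            abstract_outputs.append("OK")
        elif stack:
            abstract_outputs.append("odd" if stack.pop() % 2 else "even")
        else:
            abstract_outputs.append("null")
    return tuple(abstract_inputs), tuple(abstract_outputs)

def conflicts(concrete_words):
    """Abstract input words that were observed with more than one abstract output word."""
    seen = defaultdict(set)
    for word in concrete_words:
        inputs, outputs = abstract_run(word)
        seen[inputs].add(outputs)
    return {inputs: outputs for inputs, outputs in seen.items() if len(outputs) > 1}

WORDS = [
    [("push", 51), ("push", 2012), ("pop", None)],
    [("push", 2012), ("push", 51), ("pop", None)],
]
```

Here `conflicts(WORDS)` reports the abstract word push push pop with both final outputs, which is the signal that the abstraction of push has to be refined.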
Counter Examples and Witnesses
Counterexample: c1 c2 c3 c4 c5 c6
Its abstraction image: γ(α(c1)) γ(α(c2)) γ(α(c3)) γ(α(c4)) γ(α(c5)) γ(α(c6))
Counter Examples and Witnesses
Counterexample: c1 c2 c3 c4 c5 c6
Hybrid words: γ(α(c1)) γ(α(c2)) γ(α(c3)) c4 c5 c6 and γ(α(c1)) γ(α(c2)) γ(α(c3)) γ(α(c4)) c5 c6
Abstraction image: γ(α(c1)) γ(α(c2)) γ(α(c3)) γ(α(c4)) γ(α(c5)) γ(α(c6))
Counter Examples and Witnesses
Hybrid words: γ(α(c1)) γ(α(c2)) γ(α(c3)) c4 c5 c6 and γ(α(c1)) γ(α(c2)) γ(α(c3)) γ(α(c4)) c5 c6
Separating pattern p for c4 and d: the state representation γ(α(c1)) γ(α(c2)) γ(α(c3)) and the future c5 c6
Alphabet Abstraction Refinement (figure). Recoverable labels: Σ_C; Σ_C \ α_old(c); c; d = γ(α(p)); α_old(c); γ_old(α_old(c)); push.
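The search for the position at which a concrete symbol must be separated from its representative can be sketched as follows: replace the symbols of the counterexample by γ(α(c)) one at a time and look for the first change in the observed output. The mod-2 stack, the initial abstraction `alpha`, the representatives in `gamma`, and the function names are assumptions made for this example.

```python
def stack_output(word):
    """Hypothetical system: output of the last action of a word on a mod-2 stack."""
    stack, output = [], None
    for name, argument in word:
        if name == "push":
            stack.append(argument)
            output = "OK"
        elif stack:
            output = "odd" if stack.pop() % 2 else "even"
        else:
            output = "null"
    return output

def alpha(symbol):
    """Initial abstraction: forget the data value."""
    return symbol[0]

def gamma(abstract_symbol):
    """Concretization: one fixed representative per abstract symbol."""
    return ("push", 0) if abstract_symbol == "push" else ("pop", None)

def find_split(counterexample, output):
    """Return (index, c, d, future) where c and d = gamma(alpha(c)) behave differently."""
    previous = output(counterexample)
    for i, symbol in enumerate(counterexample):
        # Hybrid word: representatives up to position i, original symbols afterwards.
        hybrid = [gamma(alpha(c)) for c in counterexample[:i + 1]] + list(counterexample[i + 1:])
        current = output(hybrid)
        if current != previous:
            return i, symbol, gamma(alpha(symbol)), list(counterexample[i + 1:])
        previous = current
    return None  # the abstraction explains this word; no refinement needed
```

For the counterexample `[("push", 51), ("pop", None)]` the first hybrid word already changes the output from odd to even, so push(51) must be separated from the representative push(0), with pop as the distinguishing future.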
Case Study: Biometric Passport [Aarts et al., 2010]
- 262 concrete symbols, 256 of them readfile(i)
- 1 initial abstract symbol
- 8 alphabet refinements, to split readfile
- 9 final abstract symbols
readfile(i) is aggregated according to the required authentication.
(Slide from: Bernhard Steffen, VMCAI 2011, Austin, Texas)
Outline: Background; Manual Treatment of Data; Automated Alphabet Abstraction Refinement; Modelling Data Explicitly; Conclusions
How to Extend with Data?
- Data is crucial for modeling
- Interface specifications relate data in input to data in subsequent output
- Communication protocols: sequence numbers, identifiers, ...
- Extend the automaton model: data parameters in actions; state variables to remember parameter values
- How to extend the learning techniques?
Register Automata
Relation: Data Languages
The Impact of Register Automata
- Query: push(p1)/OK push(p2)/OK pop()/p2
- Abstract symbols: push(p)/ok, pop()/o(p); abstract outputs: ∈ L, ∉ L
- Concrete calls: stack.push(51); stack.push(2012); stack.pop(); concrete outputs: true, false, null, 51, 2012
A Data-Aware Nerode-Relation
Reusing structure of L*
Analysis of Counterexamples III: counterexample analysis for inferring
- New locations
- New registers
- New transitions
CE: New location (sequence of figure slides)
CE: New register (sequence of figure slides)
CE: New transition (sequence of figure slides)
Experimental Evaluation
Modeling Output Explicitly: RMMs
- RA: a word is in the language; RMM: an input leads to an output
- Example: stack of capacity 3
- RA: output encoded as guarded transition
- RMM: output with data for transitions
RMM: Explicit Output
RMM: Explicit Output
- Query: push(p1) push(p2) pop() / p2
- Abstract inputs: push(p), pop(); abstract outputs: OK, NOK, null, p
- Concrete calls: stack.push(51), stack.push(2012), stack.pop(); concrete outputs: true, false, null, 51, 2012
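To make the idea of locations plus registers concrete, here is a hand-written simulation of a register Mealy machine for a stack of capacity 3. It is a sketch for illustration only: the class `StackRMM`, its method names, and the encoding of locations as fill levels are assumptions, and nothing here is learned.

```python
class StackRMM:
    """Register Mealy machine sketch: the location is the fill level, registers hold the data."""

    CAPACITY = 3

    def __init__(self):
        self.location = 0                        # number of stored values, 0..CAPACITY
        self.registers = [None] * self.CAPACITY  # one register per stack cell

    def step(self, action, parameter=None):
        if action == "push":
            if self.location == self.CAPACITY:
                return "NOK"                     # full: no register update, same location
            self.registers[self.location] = parameter
            self.location += 1
            return "OK"
        if action == "pop":
            if self.location == 0:
                return "null"
            self.location -= 1
            return self.registers[self.location]  # output carries the stored data value
        raise ValueError(f"unknown action: {action}")

    def query(self, word):
        """Answer an output query from the initial location."""
        self.__init__()
        return [self.step(action, parameter) for action, parameter in word]
```

The query from the slide, `StackRMM().query([("push", "p1"), ("push", "p2"), ("pop", None)])`, ends with the output p2. Since data values are only stored and returned, never inspected, four locations suffice regardless of how large the data domain is.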
Inferring RMMs
- Example: nested stack of capacity 16
- RMM: 781 locations, 45k MQ, 9 EQ, 20 sec.
- Mealy, |D| = 4: > 10^9 states
Outline: Background; Manual Treatment of Data; Automated Alphabet Abstraction Refinement; Modelling Data Explicitly; Conclusions
Conclusions and Perspectives
Main practical challenges are:
- Search for counterexamples
- Counterexample analysis
Question: how much can counterexamples tell about a system? We have seen scenarios for (besides the classical locations):
- Optimal alphabet abstraction
- Optimal register allocation
- Optimal transition functions
We have seen how to get from DFA to interface programs, or from languages to program executions.
Conclusions and Perspectives: Beyond
- Investigation of language extensions: extended guards; actions with effect; procedural structure?
- Hybrid approaches and case studies
- Experimental evaluation and performance analysis
- The RERS Greybox Challenge 2014