
Background: Model Learning

For the first homework assignment, you may choose between two problems: the more practical Assignment 1a and the more theoretical Assignment 1b. Please tell the instructors which problem you want to work on and with whom (you may work alone or with somebody else). The deadline for submitting your solution is April 19. If you get stuck at any point, please discuss your problems with the instructors! If you work together with someone else, your report should contain a paragraph describing the contribution of each team member.

Model learning is emerging as an effective technique to construct black-box state machine models of hardware and software components. A recent review article in Communications of the ACM gives a general introduction to the theory and application of this technique (you do not have to read this article to complete this assignment, however). When the learned state machine models become larger, it makes sense to use model checking to check whether they satisfy some desired properties. In a recent article:

  • Paul Fiterau-Brostean, Toon Lenaerts, Erik Poll, Joeri de Ruiter, Frits Vaandrager, and Patrick Verleg. Model Learning and Model Checking of SSH Implementations. In Proceedings 24th ACM SIGSOFT International SPIN Symposium on Model Checking of Software, 13-14 July 2017, Santa Barbara, CA, USA, pages 142-151.

we used model learning to infer state diagrams of three SSH implementations, and then NuSMV2 model checking to verify that these models satisfy basic security properties and conform to the RFCs. The analysis confirmed that all tested SSH server models satisfy the stated security properties, but uncovered violations of the standard in all three implementations.

Assignment 1a

In 2015, colleagues from Radboud University published an article in which they apply model learning (or protocol state fuzzing, as they call it) to obtain models of TLS implementations:

Joeri and Erik found new security flaws in three of the implementations which they considered. They did not use model checking, however.

Your task is to analyze the learned TLS models of De Ruiter & Poll fully automatically using the model checking tool NuSMV, following the approach from the SSH paper: (a) formalize in LTL the TLS security requirements formulated by De Ruiter & Poll, (b) extract as many other functional correctness requirements from the TLS RFCs as possible (of course, they should relate to the aspects of the protocol that are captured by the model), and (c) use NuSMV to check whether the properties hold (and, if not, give a counterexample). It is possible that LTL is not sufficiently expressive to formalize some requirements. In that case, can you think of an alternative way to formalize and check these requirements? Can they be expressed in CTL?
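
As a rough illustration of the kind of difference in expressiveness meant here, consider the two formulas below. The atomic propositions p and q are hypothetical placeholders for conditions on the inputs and outputs of a model; they are not taken from De Ruiter & Poll or from the RFCs. The first formula is a typical LTL response property. The second is a standard textbook example of a CTL property that has no LTL equivalent.

```latex
% LTL: on every path, each occurrence of p is eventually followed by q
\mathbf{G}\,(p \rightarrow \mathbf{F}\, q)

% CTL: from every reachable state, some path leads to a state where q holds
\mathbf{AG}\,\mathbf{EF}\, q
```

In a NuSMV input file, formulas of the first kind are stated with the LTLSPEC keyword and formulas of the second kind with CTLSPEC.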

The TLS RFCs are

  • DIERKS, T., AND ALLEN, C. The TLS protocol version 1.0. RFC 2246, Internet Engineering Task Force, 1999.
  • DIERKS, T., AND RESCORLA, E. The Transport Layer Security (TLS) protocol version 1.1. RFC 4346, Internet Engineering Task Force, 2006.
  • DIERKS, T., AND RESCORLA, E. The Transport Layer Security (TLS) protocol version 1.2. RFC 5246, Internet Engineering Task Force, 2008.

The Mealy machine models that Joeri and Erik learned are available in .dot format via this page. Paul Fiterau was kind enough to translate these models to NuSMV format. Two of the models described by Joeri and Erik were not available on the web, but we do have NuSMV translations of them: JSSE_1.8.0_25_server_regular.smv and JSSE_1.8.0_31_server_regular.smv. Due to restrictions of the NuSMV input format, he had to replace some special symbols with other symbols, e.g. "&" by "#".
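
To give an idea of what such a translation involves, here is a minimal Python sketch that reads Mealy machine edges from .dot text and prints a NuSMV module. It is not the translation that Paul used: the example machine, the variable names state, inp and out, the choice of s0 as initial state, and the helper functions are all assumptions made for illustration. The sketch also assumes that the machine is input-complete, so that the generated case expressions are exhaustive.

```python
import re

# Hypothetical two-state Mealy machine, written in Graphviz style.
# The real learned models are larger and may be formatted differently.
DOT = '''
digraph g {
    s0 -> s1 [label="ClientHello / ServerHello & Certificate"];
    s0 -> s0 [label="Finished / Alert"];
    s1 -> s1 [label="ClientHello / Alert"];
    s1 -> s0 [label="Finished / Alert"];
}
'''

# Matches edges of the form: src -> dst [label="input / output"]
EDGE = re.compile(r'(\w+)\s*->\s*(\w+)\s*\[label="([^"/]*)/([^"]*)"\]')


def sanitize(symbol):
    """Make a label usable as a NuSMV identifier ('&' becomes '#')."""
    symbol = symbol.strip().replace(" ", "").replace("&", "#")
    # Replace any remaining character that NuSMV identifiers do not allow.
    return re.sub(r"[^A-Za-z0-9_#$-]", "_", symbol)


def parse_edges(dot_text):
    """Return a list of (source, input, output, target) tuples."""
    return [(src, sanitize(i), sanitize(o), dst)
            for src, dst, i, o in EDGE.findall(dot_text)]


def to_smv(edges, initial="s0"):
    """Emit one possible NuSMV encoding (an assumption, not the official one)."""
    states = sorted({e[0] for e in edges} | {e[3] for e in edges})
    inputs = sorted({e[1] for e in edges})
    lines = ["MODULE main", "VAR",
             f"  state : {{{', '.join(states)}}};",
             f"  inp : {{{', '.join(inputs)}}};",  # unconstrained: any input may occur
             "DEFINE", "  out := case"]
    lines += [f"    state = {s} & inp = {i} : {o};" for s, i, o, _ in edges]
    lines += ["  esac;", "ASSIGN",
              f"  init(state) := {initial};", "  next(state) := case"]
    lines += [f"    state = {s} & inp = {i} : {d};" for s, i, _, d in edges]
    lines += ["  esac;"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_smv(parse_edges(DOT)))
```

With this encoding, LTL properties can refer to inp and out directly, for example by comparing out with a sanitized output symbol.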

You have to submit a report in which you clearly describe and carefully justify your formalization of the requirements in NuSMV and the results of your model checking experiments. In addition, you also have to submit a NuSMV input file containing all your LTL formulas.

Assignment 1b

In the SSH case study of Fiterau-Brostean et al., both future and past modalities are used to formalize requirements of the protocol. Past modalities are supported by NuSMV but are not discussed in the textbook of Baier and Katoen. Since past modalities often lead to shorter and more intuitive specifications, this could be considered a shortcoming. Your task is to write a new section, (hopefully) to be included in Chapter 5 of the second edition of Baier and Katoen, in which LTL with past modalities is discussed. In order to write this section, you may use (in particular) information from the following publications:

Your new section should be written in the same style as Baier and Katoen and should discuss at least the following topics:

  • The syntax and semantics of LTL with past modalities
  • The relation between the LTL semantics from section 5.1.2 and the semantics of LTL with past modalities
  • An algorithm or collection of rewrite rules to transform each LTL formula with past modalities into an equivalent LTL formula without past modalities, illustrated on the past-time LTL formulas from the SSH paper.
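
As a starting point for the first topic, the usual semantics of the past operators can be sketched as a small Python evaluator. This is an illustration under assumptions, not a prescribed definition: formulas are encoded as nested tuples, a trace is a list of sets of atomic propositions, and the operator names Y (previous), O (once), H (historically) and S (since) follow the NuSMV syntax. The atomic propositions in the example are hypothetical. Because these operators only look backwards, a finite prefix of a trace is enough to evaluate them at a given position.

```python
def holds(f, trace, i):
    """Check whether the past-time formula f holds at position i of trace.

    trace is a list of sets of atomic propositions; f is a nested tuple
    such as ("O", ("ap", "auth")). The recursion is naive and meant for
    small examples only.
    """
    op = f[0]
    if op == "ap":   # atomic proposition
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "Y":    # previous: a predecessor position exists and satisfies f[1]
        return i > 0 and holds(f[1], trace, i - 1)
    if op == "O":    # once: f[1] holds at some position j <= i
        return any(holds(f[1], trace, j) for j in range(i + 1))
    if op == "H":    # historically: f[1] holds at all positions j <= i
        return all(holds(f[1], trace, j) for j in range(i + 1))
    if op == "S":    # since: f[2] at some j <= i, and f[1] at all k with j < k <= i
        return any(
            holds(f[2], trace, j)
            and all(holds(f[1], trace, k) for k in range(j + 1, i + 1))
            for j in range(i + 1)
        )
    raise ValueError(f"unknown operator: {op}")


# Hypothetical trace with made-up atomic propositions.
trace = [{"kex"}, {"auth"}, {"open"}]
once_auth = ("O", ("ap", "auth"))
print(holds(once_auth, trace, 2))  # auth occurred at position 1, so this holds
```

Note that Y is false at position 0 in this sketch, since the first position has no predecessor.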

If the new section only addresses the above topics, you get at most an 8. You may earn the maximal score of 10 if, in addition, you manage to (partially) answer one or more of the following questions:

  • From the work of Markey we know that, in general, temporal logic with past is exponentially more succinct than LTL. But what is the situation for the specific formulas used in the SSH case study? Are the formulas with past operators from the SSH case study more succinct than any equivalent LTL formula? (Proof?) Can the LTL formulas from the SSH case study be expressed more succinctly using past modalities?
  • All the (regular and past-time) LTL formulas from the SSH case study denote safety properties. When an LTL formula does not hold for some model, NuSMV produces an infinite counterexample, finitely represented as a "lasso": a finite prefix followed by a loop. In the case of safety properties, however, it is much more informative to provide as counterexample a minimal "bad prefix" that shows that the safety property is violated (see Chapter 4 of Baier and Katoen). Algorithms for computing minimal bad prefixes have been published, but unfortunately these algorithms have not been implemented as part of the NuSMV tool. Suppose that S is an LTL formula that denotes a safety property, that M is a transition system model for which S does not hold, and that v w^omega is a counterexample. Can you give a natural number n such that v w^n is a bad prefix? The value of n may of course depend on the size of M and S. Timo Latvala has built a tool to translate safety LTL formulas to finite automata. The tool is no longer maintained and the latest version is from 2004. But suppose we could use it. Could this help to compute a minimal bad prefix starting from an infinite counterexample generated by NuSMV? Relevant papers (background info):