# Topics for Master's theses in our group

This page lists potential Master's thesis topics related to the research done in our group. If you find any of the topics interesting and feel qualified to pursue such a thesis, be sure to get in touch with us. Also, if you have any thesis ideas of your own that you think would fit to our group, and want to be supervised, you are encouraged to contact us!

**NB:** our capacity to supervise Master students is limited unfortunately. This page collects possible topics in our group, but we cannot handle too many projects at the same time, and sometimes might have to decline supervising some projects because of this.

# Real-time Systems

### Measurement-based WCET estimation in multicore real-time systems

Modern multicore processors provide high performance at a reasonable cost. However, the complex “inter-task” interference caused by simultaneous access to shared resources, such as cache and memory, makes the estimation of programs' worst-case execution time (WCET) too difficult.

The goal of the project is to set up a platform which enables WCET estimation through measurement of real executions. In particular, the influence of different factors need to be taken into account, including: cache and memory access of the tasks running on different cores, preemption and scheduler overheads, and program input data. To achieve this, a number of steps is suggested to follow:

- Setting up a multicore processor system where
- different real-time scheduling policies are available (and ideally, new ones can be implemented), and
- the execution time of the jobs can be measured in a reasonably high accuracy.

- Designing and performing a set of experiments for WCET estimation, by effective state-space exploration.
- Comparing the results with simulation-based ones (for instance, Gem5’s).
- Comparing the results with those obtained from static analysis (
**optional**).

As a starting point, one can utilize the Linux operating system, along with a partitioned scheduling approach that is implemented by processor affinity assignment (1). The next step would be to improve this by employing a real-time operating system, such as FreeRTOS (2).

Contact: Morteza Mohaqeqi, Wang Yi

### Complexity of Dual Priority Scheduling

In 1993 Burns and Wellings proposed a scheduling algorithm called Dual Priority scheduling, which is similar to Fixed-Priority (FP) scheduling except that each task is assigned two priority levels instead of one. Each job from the task starts out with the first priority, but after a fixed offset from it's release changes to the other priority. Dual Priority scheduling was conjectured to be optimal in some settings (implicit-deadline periodic tasks on a single processor). While this conjecture has since been disproved, it could still offer greatly improved schedulability in practice compared to FP, while having some implementation benefits compared to EDF.

However, the space of possible configuration options (priorities and the offset points where priorities are switched) grows out of hand very quickly, and there is no known simple formula for finding a "best" configuration. Also, there is no better exact schedulability test known for a given configuration than simply simulating the hyper period.

The computational complexity of the Dual Priority schedulability problem is not fully understood. For implicit-deadline periodic tasks on a single processor we know the following.

- Given a particular configuration (priorities and offsets), the problem to decide if a task set is Dual Priority schedulable is in PSPACE and is NP-hard.
- The problem of deciding if any schedulable configuration exists is in PSPACE, but no lower bounds are known.

The aim of this project is to improve on one or both of the above.

Some related reading:

- DUAL PRIORITY ASSIGNMENT: A Practical Method For Increasing Processor Utilisation by Burns and Wellings
- Dual priority scheduling: Is the processor utilisation bound 100%? by Burns
- A Practical Sub-Optimal Solution for the Dual Priority Scheduling Problem by Fautrel, George, Goossens, Masson and Rodriguez
- Real-Time Scheduling on Uni- and Multiprocessors based on Priority Promotions by Pathan
- Dual Priority Scheduling is Not Optimal by Ekberg

Contact: Pontus Ekberg

Prerequisites: familiarity with algorithms, real-time scheduling and complexity theory.

# Security

### Try to break DOMPurify

DOMPurify is "a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG”. Your job is to work with us to encode the logic of DOMPurify as constraints to investigate if we can prove it to be safe against XSS attacks, or even find counterexamples.

Note: this project can easily be recast to "bring your own web validation".

Contact: Amanda Stjerna (**applications closed, sorry!**)

### Web Validation and Sanitisation Zoology

We are working on technologies to enhance the security of web applications by analysing their handling (validation and sanitisation) of strings. However, a big portion of our work is evaluated using less realistic examples and inputs. Your job will be to investigate real-world input validation and sanitisation in web applications and identify common frameworks, libraries and patterns. The best possible outcome would be a representative study of input validation and sanitisation in modern web applications. Possible methods include crawling web pages to identify common frameworks and shared code or scraping GitHub to isolate e.g. common validation regular expressions.

This project would be particularly good for someone with previous experience of front-end development and/or open source development and the open source ecosystem in general. That's not required though!

Extra options include working with us to translate your findings into runnable benchmarks for our software.

Contact: Amanda Stjerna (**applications closed, sorry!**)

# Applied Machine Learning

### Exploring properties and limitations of Graph Neural Networks (GNNs) in a security perspective view.

Graphs are universal structures of real-world complex systems. Because of strong representation learning capacity, GNNs such as graph convolutional network (GCN), graph attention network (GAT), gated graph neural network (GGNN) have brought ground-breaking performance in areas, ranging from disease diagnosis and drug discovery to recommendation systems. However, despite GNNs revolutionizing graph representation learning, there is a limited understanding of their representational properties and limitations. Recent findings indicate that small, unnoticeable perturbations of graph structure can catastrophically reduce the performance of even the strongest and most popular GNNs (1). It is very exciting to explore more understanding of GNN’s properties, so we could use them to improve current performances, optimize the usage, or against adversarial attacks on GNNs.

The main task of the thesis will be to apply an existing GNN framework to learn interesting functions on graphs. By performing some numerical analysis on crafted examples or existing benchmarks, try to summarize some interesting properties, possible intuitions, or figure out the limitations.

The experiments and goals are very flexible. For example, some interesting properties could be explored by outputting and analyzing the changes in the weights while training. And some intuitions can be caught when we slightly change the structure of the GNN. If the current goal is too simple for you, we can consider extended goals: Summarize the most valuable properties found by this study and further verify and formalize these properties. Based on found limitations or vulnerabilities. Explore possible variants that can improve corresponding GNN performance and robustness.

From this project, you can learn and practice the cutting-edge techniques of deep learning and explore its security problems. It is particularly good for people who want to learn deep learning at an advanced level.

Contact: Chencheng Liang, Philipp Rümmer

Prerequisites: basic knowledge about neural networks, Python

# Program Verification

### Formal Verification Methods for the TACC Framework

A thesis done together with the company Arista Networks.

### Extending TriCera to Parse ACSL Annotations

# SMT Solving, Constraint Solving, Theorem Proving

### Craig interpolation for mixed-integer linear programs

Craig interpolation, a principle known to logicians since the 1950s, has recently emerged in formal verification as a practical approximation method. Its applications range from efficient image computations in SAT-based model checking to accelerating convergence of fixpoint calculations for infinite-state systems. In software verification, interpolation is applied to formulae encoding the transition relation of a model underlying a program. In order to support expressive programming languages, much effort has been invested in the design of algorithms that compute interpolants for formulae of various theories.

We have recently developed interpolation procedures for the theory of linear integer arithmetic, and some extensions thereof. Similarly, interpolation procedures for linear rational arithmetic have been known and used for several years. An open question, to be addressed in this thesis, is how formulae in the combined theory of integer and rational arithmetic (mixed-integer arithmetic) can be interpolated. Such procedures would be applicable, for instance, for the verification of timed, hybrid, or cyber-physical systems. The thesis should investigate the design of interpolating calculi, decision procedures, and practical implementations of such procedures for mixed-integer linear arithmetic.

Contact: Philipp Rümmer

Prerequisites: knowledge about linear algebra + optimisation, basic knowledge about logic and theorem proving

### Benchmarking infrastructure for solvers

We are currently running most of your benchmarks interactively, without much automation. Your job would be to help us prototype a system for automating benchmarks. You would investigate current best practices as well as requirements from several conferences, and investigate both the overhead of your chosen method and the scientific implications, and then work with us to prototype a system for our performance benchmarking. Details are of course subject to discussion.

This is a good project if you like to work on infrastructure and automation. The implementation would very likely involve Docker, so any experience with that would be beneficial.

Contact: Amanda Stjerna (**applications closed, sorry!**)

### Prototyping an mcSAT solver in Rust

Rust is an increasingly popular low-level language gunning to replace C++ and in some cases C. We are interested in wether Rust is a) powerful enough to conveniently implement a theorem prover/constraint solver, and b) fast enough to be worth it. The project would involve prototyping some bits of a model-constructing SMT (mcSAT) solver to determine if the language is flexible enough, and some bits of the logic to determine if the full thing would be fast enough to be viable. A large emphasis will be placed on API design in Rust, and this should be something you are interested in.

This project is probably more useful as a master’s project, but could be scope-cut to suit a bachelor’s, with some prerequisite knowledge. It's also possible to do as a collaboration. You should have taken at least Algorithms and data structures III, or be willing to do some studying beforehand to understand the context of what an SMT solver does.

Note: an instance of this project has just started, but if you are interested, it might be possible to spin up multiple concurrent instances and even to have them usefully interact.

Contact: Amanda Stjerna (**applications closed, sorry!**)

### Extend bnfc to generate Scala

Bnfc is a project to generate parsers, pretty printers, abstract syntax trees, etc for BNF grammars, written in Haskell. We use it to express grammars, both an S-expression-based grammar for SMT-LIB and the JavaScript regular expression language. However, bnfc cannot currently generate code for Scala, only Java. Your job would be to extend the code generation to produce Scala, a process that can be tailored to your ability. The final product would also involve running a performance test of your product (and ideally optimising it).

Scope-cuts are possible to make the project smaller.

Contact: Amanda Stjerna (**applications closed, sorry!**)

# Multi-agent Systems

### Strategic Reasoning on Smart Contracts

Keywords: Formal methods, multi-agent systems, strategic reasoning, smart contracts, distributed ledger technologies, Ethereum blockchain.

A distributed ledger is digital data replicated among peers (i.e., not hierarchically organized node) in a network. n the Bitcoin network, for example, the distributed data hold account balances and monetary transactions. Some distributed ledgers take a step forward: distributed data contain not only information such as users accounts, but also programming code that can be executed (and that can modify the data). Such code is called a Smart Contract (SC).

SC can be used for a plethora of applications where several parties define the terms of an agreement, a contract in fact, which defines the future actions that each party is committed to carry out, such as: employment contracts, mortgage loans, supply chain management, protection of copyright content. Pushed to the limit, we have the concept of decentralized autonomous organization (DAO), a complex SCs which runs the managerial activities of a company, virtually without human intervention.

The main idea is to adopt an agent-centric perspective: we view a distributed ledger as a multi-agent system where agents are SCs which autonomously interact with each other. While traditional techniques focus on verifying (code of) SCs in isolation, this approach enables advanced form of reasoning involving the collective behavior of groups of SCs, such as: “If SCs A, B and C act as a coalition, can they manipulate the market price of good g?” (strategic reasoning) or "What behavior should a SC have so as to block A, B and C to control the market?" (synthesis).

The starting point is to build a formal model of SCs, which abstracts from implementation details of the several existing distributed ledgers while capturing the relevant features needed to carry out the reasoning techniques. Such a model is valuable per se as it give a high-level semantics of SCs in a distributed ledger, but it also enables the new kind of formal reasoning informally described above.

Contact: Riccardo De Masellis

### Dynamic Multi-agent Systems

Keywords: logic, multi-agent systems, strategic reasoning

Prerequisites: knowledge of logic

Multi-agent systems (MAS) are a modeling paradigm which sees the world as composed by agents which, by performing actions, collaborate or compete with each other in order to achieve their objectives. We consider discrete and concurrent multi-agent systems: at each step, each agent chooses an action and the system evolves in a new state depending on the joint actions of all agents. Given a MAS, we can verify strategic properties of the kind: “If agents a1 and a2 cooperate, they have a strategy (namely, a sequence of actions) to guarantee objective g for every possible system evolution regardless of the actions of competing agents a3, a4 and a5”. See [1] for the reference framework.

While in traditional MAS (such as [1]) the set of agents is defined at design-time, dynamic MAS (DMAS) allows for a (possibly unbounded) number of agents to join and leave the system during the evolution, as described in [2]. Now strategic reasoning becomes: “a coalition of at least n cooperating agents can ensure against (at most) m competing agents that any possible evolution of the system satisfy objective g”, where each of n and m can be a fixed number or a variable that can be quantified over.

**Project 1** (practical): implement the model-checking reasoning procedure described in [2], by using external SMT solvers for Presburger formulas, and check its performances compared to other MAS model-checkers.

**Project 2** (theoretical): in [2], all agents have the same protocol, i.e., they can perform the same set of actions at each step. The objective it to extend [2] to allow for a finite number of different agent “types”. The property to be verified would then be: “A coalition of (at least): n1 of type A1, n2 of type A2, …, nx of type Ax cooperating agents can ensure against (at most) m1 of type A1, m2 of type A2, …, mx of type Ax that any possible evolution of the system satisfy objective g”.

[1] Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. Alternating-Time Temporal Logic. January 1998.

[2] Riccardo De Masellis and Valentin Goranko. Logic-based specification and verification of homogeneous dynamic multi-agent systems. Auton. Agents Multi Agent Syst. 34(2): 34 (2020).

Contact: Riccardo De Masellis