The Evolving of CASM: Modern Compiler Engineering and Empirical Guided Language Design for a Rigorous State-Based Method

PhD Thesis


The demand for applying rigorous methods in the field of software engineering, hardware engineering, and systems engineering is still as high as ever. Over the last decades, several Domain-Specific Languages (DSLs) and tools were created to tackle different kinds of design challenges, capturing of requirements, and providing proper development support through code generation techniques for software as well as hardware fields. Despite the intensive investigation and domain exploration in their respective fields, most of the engineering methods lack proper techniques to reason about the specified design. This is where formal methods come into the picture of all engineering fields. Industrial hardware engineering and development is mostly done or supported by formal method-based techniques, whereas in software and system engineering in general formal methods are still seen as a more academic discipline despite the effort already achieved by researchers all other the world. Especially for the software domain it becomes more and more relevant to industry because incorrect behavior, safety flaws, security breaches etc. can impact a company’s reputation and market share or it can even cost human lives in safety critical or embedded applications. As a result, Model-Driven System Engineering (MDSE) methods were created which provide rich Model-Driven Development (MDD) techniques for a specific engineering domain. The MDD technique includes a DSL and tool which consists of a dedicated parser, language validation analyzer, and code generator. Unfortunately, almost all MDSE methods are tailor-made formalisms to deal with a specific problem domain. Some of these techniques define and describe the language semantics through the code generation process, others allow for full or partially simulated systems’ properties of the specified design before the creation of the actual system. Furthermore, a described design in one DSL cannot be reused or easily moved through rewriting into another DSL because of the completely different domain-specific abstraction level of the specification languages. The Abstract State Machine (ASM) method is a state-based formal method which can be seen exactly as the missing piece in the described puzzle or specification dilemma above. It supports a domain independent way to capture a system’s behavior regardless of whether the specified system is a software, hardware, or even a mixed software/hardware system. Based on an ASM, specified system reasoning can be applied in order to check certain system properties or even proof certain aspects of the described system like safety constraints. Notable to mention is that ASM specifications are by default executable specifications, which means that any ASM-specified system can be simulated without even creating or deriving a concrete implementation. By definition, the latter can be derived either by refinement techniques or if supported through code generation just like MDSE/MDD does. Nevertheless, the tooling support for ASM-based specification languages regarding the interpretation (simulation) and compilation (code generation) lack modern compiler techniques and state-of-the-art language engineering. A decade ago, a research project started to address both issues – interpretation and compilation – and was made public by the author through an open-source reimplementation named Corinthian Abstract State Machine (CASM). This PhD thesis describes the incrementally derived ASM-based compiler foundation and framework for CASM in order to research and explore further (optimizing) compiler and interpreter potentials as well as give the ability to research new language design concepts. In the course of this work, a novel ASM-based model-based transformation framework was elaborated by using a multi-level Intermediate Representation (IR) compiler design in order to achieve flexible software and/or hardware code generation with optimization focus in the foreground as well as fast execution (interpretation/simulation) as a second major research target. The defined ASM-based IR allows exploring and defining ASM- related compiler optimizations through a well-defined model and to provide a unified interface for other ASM-based language engineers and tool developers to e.g. reuse the implementation in CASM. Furthermore, an improved symbolic execution effort was achieved in this thesis for CASM as well, which is based in the ASM-based IR. This PhD thesis reports on the investigation and introduction of an object-oriented language construct for ASM-based languages, which was derived in an incremen- tal process applying two controlled experiments and one eye-tracking study. The first controlled experiment compared the understandability of three object-oriented abstractions introduced in an ASM-based language namely interfaces, mixins, and traits. Results showed that interfaces and traits have a similar good understanding, which lead to a follow-up controlled experiment investigating the usability of the two object-oriented language constructs interfaces and traits in the context of an ASM-based language syntax extension. A significant difference was discovered in the results, namely that the traits language construct is more usable compared to the interfaces language construct. Based on this insight, we conducted another controlled experiment in the form of an eye-tracking experiment for the trait-based language construct to obtain knowledge about the comprehensability of the syntax extension by analyzing the eye-gaze patterns as well as the eye fixation behavior. The outcome of these conducted studies is manifested in a novel trait-based object-oriented language construct in CASM, which provides an ability for ASM-based languages to easily describe ASM language properties in the language itself.



% BibTex
  title        = {{The Evolving of CASM: Modern Compiler Engineering and Empirical Guided Language Design for a Rigorous State-Based Method}},
  author       = {Philipp Paulweber},
  booktitle    = {PhD Thesis},
  year         = {2022},
  organization = {University of Vienna (Universität Wien)}