Debian sid contains more than 31,000 maintainer scripts, the vast majority of which are written in the POSIX shell language. In the context of the CoLiS project, we have written the tool shstats, which allows us to perform a statistical analysis of a large corpus of shell scripts. The tool operates on concrete syntax trees of shell scripts constructed by the morbig shell parser. The morbig parser has already been presented in a devroom at FOSDEM 2018 and at the MiniDebconf 2018 in Hamburg. The shstats tool comes with a number of analysis modules, and it is easy to extend by adding new modules.

In this talk we will present the design of the analyzer tool, how it can be extended with new analysis modules, and some of the results we have obtained so far on the corpus of sid maintainer scripts. Among these are answers to questions like:

- are recursive functions used in maintainer scripts?
- how many variable assignments are in reality definitions of constants?
- how many shell scripts don't do any variable assignments, besides definitions of constants?
- how often are shell constructs like while, for, if, etc. used?
- which UNIX commands are most commonly used in the corpus, and with which options?
- are there any syntax errors in the arguments of complex commands like test?
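As an illustration of the last question, here is a small hypothetical example (the variable name is made up) of the kind of runtime syntax error in arguments to test that such an analysis can look for:

```shell
# Hypothetical example: an unquoted empty variable breaks 'test'
x=""
test $x = "foo" 2>/dev/null    # expands to 'test = foo': a usage error, not a comparison
echo "unquoted exit status: $?"
test "$x" = "foo"              # quoted: a genuine comparison of "" with "foo"
echo "quoted exit status: $?"
```

POSIX specifies an exit status greater than 1 when test itself fails, so the usage error is distinguishable from an ordinary false comparison.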

Talk given at

- Debconf 2018, Hsinchu (Taiwan), July 31, 2018.

Parsing the POSIX shell language is challenging: lexical analysis depends on the parsing context; parsing is specified by an ambiguous BNF grammar annotated with ad-hoc rules; the language specification is informal... and, the icing on the cake: statically parsing a shell script is actually undecidable!
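A tiny (contrived, but valid POSIX) illustration of why lexing depends on the parsing context: reserved words are only recognized in certain positions, so the same token can act as a keyword, a variable name, and an ordinary word in a single line:

```shell
# 'for' appears here as a keyword, a loop variable, and a plain word
for for in for; do echo "$for"; done
# prints: for
```

A context-free lexer cannot decide on its own whether `for` is a keyword or a word; only the parser knows.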

Forget about your textbooks! Existing implementations of shell interpreters contain hand-crafted, character-level, non-modular syntactic analyzers, and yes, their source code is very hard to read.

What if you had to program the parser for a shell script verifier? Would you be willing to throw away the lexer and parser generators that have let you write high-level declarative implementations for decades? Would you be willing to mix lexing, parsing and evaluation in a single large bowl of spaghetti code? Of course not. You would hardly trust such a parser.

In this talk, we will introduce "Morbig", a modular, high-level and interoperable parser for POSIX shell. The implementation of Morbig relies on key features of Menhir, an LR(1) parser generator for OCaml.

Talk given at

- MiniDebconf 2018, Hamburg (Germany), May 18, 2018.

Program verification aims at obtaining a formal assurance that a program is correct with respect to a given specification. This is achieved by constructing a formal proof of correctness of the program. In contrast to program testing, the existence of a proof assures that the program behaves correctly in any situation described by the specification. Failure of an attempt to verify a program, on the other hand, can often be used to generate useful test cases.

A possible example of a program specification is the absence of execution errors under certain initial conditions. Automatic program verification, even for this kind of specification, is an extremely challenging task. In the case of Debian maintainer scripts we are faced with even more challenging properties, like idempotency of scripts (required by policy), or commutation of scripts.
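To make the idempotency requirement concrete, here is a minimal hypothetical sketch (the directory names are made up): running a script twice must leave the system in the same state as running it once.

```shell
# Minimal sketch of (non-)idempotency, using a throwaway directory
d=$(mktemp -d)
mkdir -p "$d/state"    # idempotent: '-p' succeeds even if the directory exists
mkdir -p "$d/state"    # second run: still succeeds, same final state
# mkdir "$d/state"     # without '-p' this would fail on the second run
rm -r "$d"
```

Verifying idempotency means proving this kind of property for arbitrary compositions of commands, not just spotting the obvious patterns.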

The project is still in its early stages, so there are no results to present yet. However, I will explain why I think that the case of Debian maintainer scripts is very interesting for program verification: some aspects of the scripts (POSIX shell, manipulation of a complex data structure) make the problem very difficult, while other aspects of the Debian case are likely to make it easier. I am also very interested in input from the audience about possible use cases of script verification in the context of Debian.

Talk given at

- Debconf 2016, Cape Town (South Africa), July 8, 2016.

- which packages are installable and which are not?
- for which packages is it possible to install a build environment?
- which packages are absolutely mandatory for the installation of a package (which might have dependencies with alternatives, or on virtual packages)?
- which packages will certainly become uninstallable when I upgrade a certain package?

This is joint work with Pietro Abate, Yacine Boufkhad, Jaap Boender, Roberto Di Cosmo, Jérôme Vouillon, and Stefano Zacchiroli.

Talk given at the

- pkgsrcCon 2013, Berlin (Germany), March 23, 2013.

This is joint work with Pietro Abate, Roberto Di Cosmo, and Stefano Zacchiroli.

Talk given at the

- Debian Conference 2012, Managua (Nicaragua), July 14, 2012.

(or: How Algebra Helps to Solve Intruder Constraints)

The main goal of this paper is to set up a general approach that works for a whole class of so-called monoidal theories which contains many of the specific cases that have been considered so far in an ad-hoc way, e.g. exclusive or, Abelian groups, exclusive or in combination with the homomorphism axiom.

We follow a classical schema for cryptographic protocol analysis which proves
first a locality result and then reduces the insecurity problem to a symbolic
constraint solving problem. This approach strongly relies on the
correspondence between a monoidal theory *E* and a semiring
*S_E*, which we use to deal with the symbolic constraints. We
show that the well-defined symbolic constraints that are generated by
reasonable protocols can be solved provided that unification in the monoidal
theory satisfies some additional properties. The resolution process boils down
to solving particular quadratic Diophantine equations that are reduced to
linear Diophantine equations, thanks to linear algebra results and the
well-definedness of the problem. Examples of theories that do not satisfy our
additional properties appear to be undecidable, which suggests that our
characterization is reasonably tight.
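As a concrete instance (a standard example, sketched here, not a result specific to this paper): the theory of exclusive or is monoidal, and its associated semiring is the two-element field, so symbolic constraints over it translate into linear algebra:

```latex
% Exclusive or as a monoidal theory: associativity, commutativity, unit, nilpotence
x \oplus (y \oplus z) = (x \oplus y) \oplus z \qquad
x \oplus y = y \oplus x \qquad
x \oplus 0 = x \qquad
x \oplus x = 0
% The associated semiring is S_E = \mathbb{Z}/2\mathbb{Z}, the field with two elements.
```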

This is joint work with Stéphanie Delaune, Pascal Lafourcade, and Denis Lugiez.

Talk given at

- Seminar of the Laboratoire d'Algorithmique, Complexité et Logique (LACL), Université Paris-12, Créteil, 12/2/2007.

This trace reachability problem can be expressed as a system of symbolic
deducibility constraints for a certain inference system describing the
capabilities of the attacker. One main step of our proof consists in reducing
deducibility constraints to constraints for deducibility in one step of the
inference system. This constraint system, in turn, can be expressed as a
system of quadratic equations of a particular form over *Z/2Z[h]*, the
ring of polynomials in one indeterminate over the finite field
*Z/2Z*. We show that satisfiability of such systems is
decidable.
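For intuition, a one-line calculation in this ring (a standard fact of characteristic-2 arithmetic, not specific to the proof): coefficients live in *Z/2Z*, so cross terms with even coefficients vanish, which is what makes the quadratic systems tractable:

```latex
(1 + h)^2 = 1 + 2h + h^2 = 1 + h^2 \quad \text{in } \mathbb{Z}/2\mathbb{Z}[h]
```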

This is joint work with Stéphanie Delaune, Pascal Lafourcade, and Denis Lugiez.

Talk given at

- Research Center for Verification and Semantics (CVS), AIST, Osaka, Japan, March 2006.

- the absence of a common input language and a common semantics for the existing verification tools
- protocol specification languages that are too simple or too abstract
- an abstraction of encryption methods that is too far removed from real protocol implementations

- development of an expressive specification language for cryptographic protocols with a precise semantics, and of a generic language for protocol properties in which many variants of the classical properties can be expressed
- integration of the existing verification tools from the laboratories participating in the project into a toolbox, in particular the development of translators from the specification language and the property language into the input languages of the tools
- verification of cryptographic protocols under a weakening of the perfect-encryption hypothesis, in particular the replacement of nonces by timestamps or counters, and taking into account the algebraic properties of cryptographic primitives
- development of a common interface with a uniform presentation of proofs and attacks

Talk given at

- IRISA, Rennes, 27/01/2005.

Talk given at

- the 7th International Workshop on Unification (UNIF'03), Valencia, Spain, 8/6/2003.

Tree constraints are first-order formulas built over a given set of predicate symbols with a fixed interpretation in a domain of trees. The semantics of such a tree constraint is a relation over trees. Tree automata, on the other hand, provide a mechanism to define sets of trees. Tree automata are well-investigated; they are known to enjoy basically the same nice properties as word automata.

Unfortunately, classical automata are not well equipped to recognize the kind of relation that we can get as the semantics of a constraint. What is missing is

- a means to encode a (non-unary) relation, that is, to recognize a language of n-tuples of trees. The standard technique is to consider an alphabet of n-tuples of symbols.
- the ability of the automaton to perform tests, in particular equality and disequality tests. This is necessary since tree constraints may contain explicit or implicit equations.
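To illustrate the tuple encoding (a textbook construction, not specific to our results): a pair of trees over an alphabet Σ is read as a single tree over Σ × Σ by overlaying the two trees symbol by symbol, with padding where their shapes differ:

```latex
% Overlay encoding of a pair of trees as one tree over an alphabet of pairs
\langle f(a,b),\; f(b,a) \rangle \;\longmapsto\; (f,f)\bigl((a,b),\,(b,a)\bigr)
```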

We will discuss the (mostly negative) results that we have obtained for our new class of tree automata, and we will illustrate the automata approach with the (still open) problem of entailment of non-structural subtype constraints.

This is in part joint work with Alex Aiken, Joachim Niehren, Tim Priesnitz, and Zhendong Su.

Talk given at

- Tsukuba Software Science Seminar, Tsukuba, Japan, 26/11/2001.
- 11th Seminar on Algebra, Logic and Geometry in Informatics (ALGI), Kinosaki, Japan, 4/12/2001.
- Séminaire du Laboratoire Spécification et Vérification, ENS Cachan, France, 15/1/2002.

Ralf Treinen