Institut de Recherche en Informatique Fondamentale (IRIF)



IRIF, the Research Institute on the Foundations of Computer Science, is a research laboratory of CNRS and Université de Paris, also hosting two Inria project-teams.

The research conducted at IRIF is based on the study and understanding of the foundations of all computer science, in order to provide innovative solutions to the current and future challenges of digital sciences.

IRIF hosts about 200 people. Six of its members have been distinguished by the European Research Council (ERC), five are members of the Institut Universitaire de France (IUF), and two are members of the Academia Europæa.

Michele Pagani

10.12.2019
Michele Pagani (IRIF), Alois Brunel (Deepomatic) and Damiano Mazza (LIPN) will present at POPL'20 an effect-free extension of the backpropagation algorithm to higher-order functional programming, allowing for a logical understanding of its dynamics thanks to linear logic.
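The paper itself gives an effect-free, purely functional account of the algorithm via linear logic; for contrast, here is a minimal OCaml sketch of the classical first-order technique it generalizes, tape-based reverse-mode differentiation (an illustration only, not the construction from the POPL'20 paper).

(* Each node carries its value and a mutable adjoint; the tape records, for
   every operation, a closure that propagates adjoints backwards. *)
type node = { value : float; mutable adjoint : float }

let tape : (unit -> unit) list ref = ref []

let const v = { value = v; adjoint = 0. }

let add x y =
  let z = { value = x.value +. y.value; adjoint = 0. } in
  tape := (fun () ->
    x.adjoint <- x.adjoint +. z.adjoint;
    y.adjoint <- y.adjoint +. z.adjoint) :: !tape;
  z

let mul x y =
  let z = { value = x.value *. y.value; adjoint = 0. } in
  tape := (fun () ->
    x.adjoint <- x.adjoint +. y.value *. z.adjoint;
    y.adjoint <- y.adjoint +. x.value *. z.adjoint) :: !tape;
  z

(* Gradient of a two-argument function built from add/mul. *)
let grad f x y =
  tape := [];
  let vx = const x and vy = const y in
  let out = f vx vy in
  out.adjoint <- 1.;
  List.iter (fun k -> k ()) !tape;   (* the tape is already in reverse order *)
  (vx.adjoint, vy.adjoint)

let () =
  (* For f(x,y) = x*y + x: df/dx = y + 1 and df/dy = x, so this prints 3 and 3. *)
  let gx, gy = grad (fun x y -> add (mul x y) x) 3. 2. in
  Printf.printf "%g %g\n" gx gy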

Claire Mathieu

8.10.2019
The SODA 2020 conference will include a paper by Vincent Cohen-Addad (LIP6), Frederick Mallmann-Trenn (King's College) and Claire Mathieu (IRIF) about computing with noisy data. The problem: select valuable objects in a setting where each assessment has a probability of error, using redundant assessments.
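As a toy illustration of the role of redundancy (a back-of-the-envelope OCaml simulation, not the algorithm of the SODA'20 paper): repeating each noisy assessment and taking a majority vote drives the per-object error down exponentially in the number of repetitions.

(* One assessment is wrong with probability p < 1/2. *)
let noisy_assess p truth =
  if Random.float 1.0 < p then not truth else truth

(* Repeat the assessment k times and take a majority vote. *)
let majority_assess ~p ~k truth =
  let yes = ref 0 in
  for _ = 1 to k do
    if noisy_assess p truth then incr yes
  done;
  2 * !yes > k

let () =
  Random.self_init ();
  let trials = 10_000 and errors = ref 0 in
  for _ = 1 to trials do
    if not (majority_assess ~p:0.3 ~k:15 true) then incr errors
  done;
  (* With p = 0.3 and k = 15, the empirical error is far below the
     per-assessment error of 0.3. *)
  Printf.printf "empirical error: %.4f\n"
    (float_of_int !errors /. float_of_int trials)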

Claire Mathieu

16.10.2019
On December 16th, a half-day of talks aimed at a non-specialized audience will take place at IRIF in celebration of Algorithms, the research domain of Claire Mathieu, 2019 recipient of a CNRS Silver Medal. The event will conclude with a discussion of new research directions in Algorithms. Registration is free; please register before November 30th.

Yann Régis-Gianas

20.11.2019
Yann Régis-Gianas (IRIF) co-organizes at IRIF the workshop “Tezos Smart Contract Languages and Formal Verification” on November 21 and 22. Registration is free but mandatory.

(These news items are displayed in a randomized-priority order.)

Verification
Wednesday December 11, 2019, 11AM, Salle 1007
Yotam Feldman (Tel Aviv University) Complexity and Information in Invariant Inference

This work addresses the complexity of SAT-based invariant inference, a prominent approach to safety verification. We consider the problem of inferring an inductive invariant of polynomial length given a transition system and a safety property. We analyze the complexity of this problem in a black-box model, called the Hoare-query model, which is general enough to capture algorithms such as IC3/PDR and its variants. An algorithm in this model learns about the system's reachable states by querying the validity of Hoare triples.
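To make the model concrete, here is a toy OCaml rendering of a Hoare-query oracle (illustrative only: the finite transition system and the extensional representation of predicates below are made up, not taken from the paper).

(* States are integers 0..num_states-1; the transition relation is explicit. *)
type state = int
let num_states = 8

(* Example transition system: a counter that increments modulo 6. *)
let step s s' = s < 6 && s' = (s + 1) mod 6

(* Predicates over states, given extensionally. *)
type pred = state -> bool

(* The only interface an inference algorithm in the Hoare-query model may use:
   is the triple {pre} step {post} valid, i.e. does every transition from a
   state satisfying pre land in a state satisfying post? *)
let hoare_valid (pre : pred) (post : pred) : bool =
  let ok = ref true in
  for s = 0 to num_states - 1 do
    for s' = 0 to num_states - 1 do
      if pre s && step s s' && not (post s') then ok := false
    done
  done;
  !ok

let () =
  (* Checking that the candidate invariant "state < 6" is inductive. *)
  let inv s = s < 6 in
  Printf.printf "inductive: %b\n" (hoare_valid inv inv)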

We show that in general an algorithm in the Hoare-query model requires an exponential number of queries. Our lower bound is information-theoretic and applies even to computationally unrestricted algorithms, showing that no choice of generalization from the partial information obtained in a polynomial number of Hoare queries can lead to an efficient invariant inference procedure in this class.

We then show, for the first time, that by utilizing rich Hoare queries, as done in PDR, inference can be exponentially more efficient than approaches such as ICE learning, which only utilize inductiveness checks of candidates. We do so by constructing a class of transition systems for which a simple version of PDR with a single frame infers invariants in a polynomial number of queries, whereas every algorithm using only inductiveness checks and counterexamples requires an exponential number of queries.

Our results also shed light on connections and differences with the classical theory of exact concept learning with queries, and imply that learning from counterexamples to induction is harder than classical exact learning from labeled examples. This demonstrates that the convergence rate of Counterexample-Guided Inductive Synthesis depends on the form of counterexamples.

Joint work with Neil Immerman, Mooly Sagiv, and Sharon Shoham, to appear in POPL'20.

Type theory and realisability
Wednesday December 11, 2019, 2PM, Salle 1007
Danny Gratzer Type Theory à la Mode

Modalities have proved to be a recurring phenomenon in type theories, arising in both mathematical and computational contexts. For instance, recent work on cubical type theory has simplified the construction of univalent universes using the internal type theory of cubical sets supplemented with a comonad. On the other hand, guarded type theory has provided an account of coinduction using a handful of modalities.

Despite the wide variety of uses, a general framework for modal dependent type theories has still proven elusive. Each modal situation requires a handcrafted type theory, and establishing the properties of these type theories requires a significant technical investment.

In this work, we have attempted to isolate a class of modal situations which can be given a single uniform syntax and allow for the typical metatheorems of type theory to be proven once and for all. Our calculus takes a mode theory as a parameter (an abstract description of the collection of modes, the modalities between them, and the ways in which they interact) and produces a full dependent type theory satisfying canonicity.

Our approach generalizes the ideas latent in Nuyts, Vezzosi, and Devriese, and simplifies the forthcoming work by Licata, Shulman, and Riley on a subset of mode theories. This calculus is sufficiently flexible to model internal parametricity, guarded recursion, and even axiomatic cohesion.

Proofs, programs and systems
Thursday December 12, 2019, 10:30AM, École normale supérieure de Lyon
Amina Doumane, Cristina Matache Séminaire Chocola

PhD defences
Thursday December 12, 2019, 2PM, Salle 2014, Bâtiment Sophie Germain
Théo Zimmermann (IRIF) Challenges in the collaborative evolution of a proof language and its ecosystem

In this thesis, I present the application of software engineering methods and knowledge to the development, maintenance, and evolution of Coq (an interactive proof assistant based on type theory) and its package ecosystem. Coq has been developed at Inria since 1984, but has only recently seen a surge in its user base, which has led to much stronger concerns about its maintainability and to the involvement of external contributors in the evolution of both Coq and its ecosystem of plugins and libraries.

Recent years have seen important changes in the development processes of Coq, of which I have been a witness and an actor (adoption of GitHub as a development platform, first for its pull request mechanism, then for its bug tracker, adoption of continuous integration, switch to shorter release cycles, increased involvement of external contributors in the open source development and maintenance process). The contributions of this thesis include a historical description of these changes, the refinement of existing processes, and the design of new ones, the design and implementation of new tools to help the application of these processes, and the validation of these changes through rigorous empirical evaluation.

Involving external contributors is also very useful at the level of the package ecosystem. This thesis additionally contains an analysis of package distribution methods, and a focus on the problem of the long-term maintenance of single-maintainer packages.

Manuscript

IRIF Cake
Thursday December 12, 2019, 5PM, In front of room 3052
Geoffroy Couteau, Gaëtan Douéneau, Chaitanya Leena-Subramaniam (IRIF CakeTM) Gâteau de l'IRIF

IRIF CakeTM is an amazing opportunity to meet people while simultaneously eating cakes baked by your fellow colleagues! Join us every Thursday, at 5pm, in front of room 3052 (Sophie Germain 3rd floor) for a weekly feast. You can also express your cooking skills and volunteer to bake a cake by sending an email to cake@irif.fr.

Higher categories, polygraphs and homotopy
Friday December 13, 2019, 2PM, Salle 1007
Chaitanya Leena-Subramaniam (IRIF) Opetopic algebras II: Homotopy-coherent opetopic algebras

Automata
Friday December 13, 2019, 2:30PM, Salle 3052
Guillaume Lagarde (LaBRI) Trade-offs Between Size and Degree in Polynomial Calculus

Introduced by Clegg et al. (STOC'96) to capture Gröbner basis computations, Polynomial Calculus Resolution (PCR) is an algebraic proof system for certifying the unsatisfiability of CNF formulas. Impagliazzo et al. (CC'99) established that if an unsatisfiable k-CNF formula over n variables has a PCR refutation of small size (that is, polynomial size), then this formula also has a refutation of small degree, i.e., O(sqrt(n log n)). A natural question is whether we can achieve both small size and small degree in the same refutation. A situation similar in spirit arises in the more classical resolution proof system, where degree is replaced by width and size by length. In this setting, Thapen (TOC '16) addressed the trade-off question by providing a negative answer: the decrease in width necessarily comes at the expense of an exponential blow-up in length. Extending ideas from Thapen, our main result is to prove that a strong size-degree trade-off is also necessary in PCR.
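For orientation, these are the standard derivation rules of polynomial calculus over a field F (PCR additionally introduces twin variables for negated literals); the size of a refutation counts monomials and its degree is the maximal degree of a line:

\[
\frac{}{\,x^2 - x\,}\ \text{(Boolean axiom)}
\qquad
\frac{p \qquad q}{\alpha p + \beta q}\ \ (\alpha, \beta \in \mathbb{F})
\qquad
\frac{p}{x \cdot p}
\]

A refutation of a CNF formula, encoded as a set of polynomials, is a derivation of the constant 1 from those polynomials and the Boolean axioms.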

Joint work with Jakob Nordström, Dmitry Sokolov, Joseph Swernofsky

Algorithms and complexity
Monday December 16, 2019, 12:30PM, Amphi Turing
M. Teillaud, T. Starikovskaya, M. Bonamy, C. Mathieu Autour des algorithmes

This half-day of talks is in celebration of Algorithms, the research domain of Claire Mathieu, 2019 recipient of a CNRS Silver Medal. The talks, aimed at a non-specialized audience, will present a sample of research on Algorithms in France in a few selected areas: Computational Geometry (Monique Teillaud), Strings (Tatiana Starikovskaya), Graphs and Complexity (Marthe Bonamy), and Social Choice (Claire Mathieu). The afternoon will conclude with a discussion of new research directions in Algorithms. The participants in the roundtable will give their viewpoint on problems in the design and analysis of algorithms arising from their respective research domains. We may hear about statistical machine learning, clustering, and social networks (Anne Boyer), data journalism, detection of fake news, and managing data in the Cloud (Ioana Manolescu), and economics and society (Camille Terrier).

Programme:

12:30-13:15: Welcome and coffee

13:15-13:30: Opening

13:30-14:15: Talk by Monique Teillaud, Autour des triangulations de Delaunay

14:15-15:00: Talk by Tatiana Starikovskaya, Streaming algorithms for string processing

15:00-15:45: Talk by Marthe Bonamy, Complexity of graph coloring

15:45-16:15: Afternoon snack

16:15-17:00: Talk by Claire Mathieu, Rank Aggregation

17:00-18:00: Roundtable “Prospectives de l'algorithmique en France”, moderated by Claire Mathieu. Participants: Anne Boyer, Ioana Manolescu, and Camille Terrier.

Verification
Monday December 16, 2019, 11AM, Salle 1007
Jules Villard (Facebook) The Infer Static Analysis platform

Infer is an open-source static analysis platform for Java, C, C++, and Objective-C. Infer is deployed at several companies where it helps developers write better code.

This talk will present how Infer is used at Facebook, where it analyses thousands of code changes every month, leading to thousands of bugs being found and fixed before they reach users. We will then see how to write your own inter-procedural static analysis in just a few lines of code inside Infer, and automatically be able to apply it to millions of lines of code.
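As a rough idea of what “a few lines of code” means here: an analysis in this style is typically given by an abstract domain plus transfer functions, which the framework lifts to an inter-procedural analysis. The OCaml sketch below only illustrates that shape; it is not Infer's actual API.

(* Generic shape of a checker in an abstract-interpretation framework
   (illustrative interface, not Infer's API). *)
module type ABSTRACT_DOMAIN = sig
  type t
  val leq : lhs:t -> rhs:t -> bool   (* partial order on abstract states *)
  val join : t -> t -> t             (* least upper bound *)
  val widen : prev:t -> next:t -> t  (* enforce termination of fixpoints *)
end

module type TRANSFER_FUNCTIONS = sig
  type astate
  type instr
  val exec_instr : astate -> instr -> astate  (* abstract semantics of one instruction *)
end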

PhD defences
Monday December 16, 2019, 2:30PM, Salle 2017, Bâtiment Sophie Germain
Adrien Husson (IRIF) Logical foundations of a modelling assistant for molecular biology

This thesis addresses the issue of “Executable Knowledge Representation” in the context of molecular biology. We introduce the foundation of a logical framework, termed iota, whose aim is to facilitate knowledge collation of molecular interactions at the level of proteins while allowing the modeler to compile a reasonable fragment of the logic into a finite set of executable graph rewriting rules. We define a logic FO[↓] over cell state transitions. States represent cell contents; domain elements are protein parts and relations are protein-protein bindings. The unary logical operator ↓ selects transitions where as little as possible happens. Formulas over transitions may also denote runs, which are finite or infinite sequences of transitions. Every transition formula is moreover associated with a set of rewriting rules equipped with an operational semantics. We introduce two deductive systems that act as “typing” for formulas. We show that if a formula is typable in the first system then the execution of its associated rule set produces exactly the runs denoted by the formula, and that if it is typable in the second system then its associated rule set is finite. We introduce a grammar that produces formulas typable in both systems, up to logical equivalence. Finally, we study decidability and definability properties of fragments of FO[↓]. In particular, we show that formulas typable in the second system lie in a tight fragment of FO, which implies that the operator ↓ can then be eliminated.

Manuscript

Automata
Tuesday December 17, 2019, 2:30PM, Salle 0010
Achim Blumensath (Masaryk University) To be announced.
Note the unusual room and time.

Algorithms and complexity
Tuesday December 17, 2019, 11AM, Salle 1007
Clément Canonne (IBM Research) Uniformity Testing for High-Dimensional Distributions: Subcube Conditioning, Random Restrictions, and Mean Testing

Given a distribution p on {-1,1}^d, we want to test whether p is uniform. If p is assumed to be a product distribution, this can be done with Theta(sqrt{d}) samples; without this assumption, things get exponentially worse and Theta(sqrt{2^d}) samples are necessary and sufficient. Assume however that we can condition on arbitrary bits: does the problem become easier? If so, by how much?

This is the “subcube conditional sampling model”, first considered in Bhattacharyya and Chakraborty (2018). We provide a nearly optimal ~O(sqrt{d})-algorithm for testing uniformity in this model. The key component of our proof is a natural notion of random restriction for distributions on {-1,1}^d, and a quantitative analysis of how such a restriction affects the mean vector of the distribution. Along the way, we provide a very nearly tight upper bound on the (standard) sample complexity of a natural problem, “mean testing.”
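To illustrate the mean-testing primitive (a minimal OCaml sketch; the pairing of samples and the acceptance threshold are illustrative, not the paper's tester): for two independent samples x, y from p, the expectation of <x, y> equals the squared norm of the mean vector of p, which is 0 under the uniform distribution.

(* Inner product of two points of {-1,1}^d encoded as float arrays. *)
let dot x y =
  let s = ref 0. in
  Array.iteri (fun i xi -> s := !s +. xi *. y.(i)) x;
  !s

(* [samples] holds an even number of i.i.d. samples from p; the test averages
   <x, y> over disjoint pairs and flags the distribution as non-uniform when
   the average is significantly positive. *)
let mean_test ~threshold samples =
  let m = Array.length samples / 2 in
  let acc = ref 0. in
  for i = 0 to m - 1 do
    acc := !acc +. dot samples.(2 * i) samples.(2 * i + 1)
  done;
  !acc /. float_of_int m > threshold   (* true = reject uniformity *)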

Joint work with Xi Chen, Gautam Kamath, Amit Levi, and Erik Waingarten.

Semantics
Tuesday December 17, 2019, 10:30AM, Salle 3052
Yannick Zakowski (Irisa/Inria) From representing recursive and impure programs in Coq to a modular formal semantics of LLVM IR

The DeepSpec research project is a cross-institution, cross-project investigation to push further the science of specification and verification of software artifacts. Its ambition is crystallized into four qualities that specifications should have: they should be rich, live, two-sided and formal.

In this talk, we will focus on the design and implementation of Interaction Trees (ITrees), a general-purpose data structure for representing in Coq potentially divergent, effectful programs. ITrees are built out of uninterpreted events and their continuations. They support compositional construction of interpreters from event handlers, which give meaning to events by defining their semantics as monadic actions. They are expressive enough to represent impure and potentially nonterminating, mutually recursive computations in Coq. Finally, they give rise to a theory enabling equational reasoning, up to weak bisimulation, about ITrees and monadic computations built from them. In contrast to other approaches, ITrees are executable via code extraction.
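For intuition, here is a rough OCaml analogue of the ITree data structure and of event handling (the real development is coinductive and lives in Coq; the concrete event signature and the use of laziness below are illustrative simplifications).

(* A fixed, illustrative event signature: each event says what type of answer
   the environment must provide. *)
type _ event =
  | Print : string -> unit event
  | Read : string event

(* An interaction tree either returns, takes a silent step, or emits an event
   together with a continuation expecting the answer. Laziness stands in for
   coinduction. *)
type 'r itree = 'r itreeF Lazy.t
and _ itreeF =
  | Ret : 'r -> 'r itreeF
  | Tau : 'r itree -> 'r itreeF
  | Vis : 'a event * ('a -> 'r itree) -> 'r itreeF

let ret x = lazy (Ret x)
let vis e k = lazy (Vis (e, k))

(* Monadic bind: graft the continuation k at every Ret leaf of t. *)
let rec bind : type a b. a itree -> (a -> b itree) -> b itree =
 fun t k ->
  lazy
    (match Lazy.force t with
     | Ret x -> Lazy.force (k x)
     | Tau t' -> Tau (bind t' k)
     | Vis (e, h) -> Vis (e, fun a -> bind (h a) k))

(* One possible handler: interpret events against stdin/stdout. *)
let rec run : type a. a itree -> a =
 fun t ->
  match Lazy.force t with
  | Ret x -> x
  | Tau t' -> run t'
  | Vis (Print s, k) -> print_endline s; run (k ())
  | Vis (Read, k) -> run (k (read_line ()))

(* A tiny program: read one line and echo it back. *)
let echo : unit itree = vis Read (fun s -> vis (Print s) ret)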

ITrees are expressive enough to serve as a language of specification for most projects of the DeepSpec ecosystem, making them a sort of lingua franca for formal specification and helping connect projects together. Furthermore, their ability to be extracted makes them compatible with the ambitions of liveness and two-sidedness.

In the second part of this talk, we will focus on how ITrees are used in one DeepSpec project in particular: Vellvm. More specifically, we will give an account of the ongoing effort to use ITrees as a means to define a modular formal semantics of LLVM IR.

PhD defences
Tuesday December 17, 2019, 10:5AM, 0010
Mengchuan Zou Aspects of Efficiency in Selected Problems of Computation on Large Graphs

This thesis presents three works on different aspects of efficiency in algorithm design for large-scale graph computations.

In the first work, set in classical centralized computing, we consider the question of generalizing modular decomposition and designing time-efficient algorithms for it. Modular decomposition, and more broadly module detection, are ways to reveal and analyze modular properties in structured data. Since the classical modular decomposition is well studied and has an optimal linear-time algorithm, we first study generalizations of these concepts to hypergraphs and present positive results for three definitions of modular decomposition in hypergraphs from the literature. We also consider the generalization that allows errors in classical graph modules and present negative results for two definitions of this kind.

The second work focuses on graph data query scenarios. Here the model differs from classical computing in that we are not designing an algorithm that solves the original problem directly; instead, an oracle provides partial information about its solution, each oracle query has a time or resource cost, and we must decide how to query the oracle efficiently in order to recover the exact solution. Efficiency is therefore measured in terms of query cost. We study the generalized binary search problem, in which one computes an efficient query strategy to find a hidden target in a graph, and we present our results on approximating the optimal strategy of generalized binary search on weighted trees.

The third work addresses memory efficiency. The computation is distributed and memory-restricted: every node stores its local data, exchanges data by message passing, and can perform local computations. This is similar to the LOCAL/CONGEST models in distributed computing, but our model additionally requires that every node store only a constant number of variables with respect to its degree. This model can also describe natural algorithms. We implement an existing multiplicative reweighting procedure for approximating the maximum s-t flow problem in this model; this type of methodology may open new opportunities for the field of local or natural algorithms.

From a methodological point of view, the three efficiency concerns correspond to three types of scenarios: in the first, given the problem, we design the most efficient algorithm by hand; in the second, efficiency is an objective function (we model query costs as an objective and use approximation-algorithm techniques to design an efficient strategy); in the third, efficiency appears as a memory constraint under which we design our algorithms.
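As an illustration of the oracle-query model in the second part, here is a minimal OCaml sketch of generalized binary search on an unweighted tree (the classical setting; the thesis treats weighted trees, and the example tree and simulated oracle below are made up for illustration). A query on a vertex either confirms it is the hidden target or returns the neighbour lying toward the target, and querying a vertex that splits the remaining candidates as evenly as possible keeps the number of queries logarithmic.

(* Example tree on 6 vertices, given by adjacency lists. *)
let adj = [| [1]; [0; 2; 3]; [1]; [1; 4; 5]; [3]; [3] |]

let target = 4  (* known only to the oracle, not to the search procedure *)

(* Vertices reachable from src without going through blocked. *)
let reachable_without src blocked =
  let seen = Array.make (Array.length adj) false in
  let rec dfs x =
    if x <> blocked && not seen.(x) then begin
      seen.(x) <- true;
      List.iter dfs adj.(x)
    end
  in
  dfs src; seen

(* Oracle: None if v is the target, otherwise the neighbour of v toward it. *)
let query v =
  if v = target then None
  else Some (List.find (fun u -> (reachable_without u v).(target)) adj.(v))

(* Keep the set of vertices compatible with all answers so far, and always
   query a vertex minimizing the worst-case size of the next candidate set. *)
let search () =
  let all = List.init (Array.length adj) (fun i -> i) in
  let rec loop cand queries =
    let worst_case v =
      List.fold_left
        (fun acc u ->
           let seen = reachable_without u v in
           max acc (List.length (List.filter (fun w -> seen.(w)) cand)))
        0 adj.(v)
    in
    let v =
      List.fold_left
        (fun best w -> if worst_case w < worst_case best then w else best)
        (List.hd cand) cand
    in
    match query v with
    | None -> (v, queries + 1)
    | Some u ->
        let seen = reachable_without u v in
        loop (List.filter (fun w -> seen.(w)) cand) (queries + 1)
  in
  loop all 0

let () =
  let found, q = search () in
  Printf.printf "target %d found with %d queries\n" found q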