Project description
TITLE: Advanced Typing of Map Data Structures in Elixir
This project is proposed in collaboration with Dashbit .

Context

Elixir is an open-source dynamic functional programming language that runs on the Erlang Virtual Machine [1], designed for building scalable and maintainable applications. Its characteristics have earned it a surging adoption by hundreds of industrial actors and tens of thousands of developers. It is used by companies such as Discord, PepsiCo, and Pinterest.
Static typing seems nowadays to be the most important request coming from the Elixir community. IRIF has an ongoing collaboration with the Elixir development team to define and implement static type-checking for Elixir. The current type system we are implementing for Elixir is described in [2]. It is a type system based on the theory of semantic subtyping [3,4], thanks to which union, intersection, and negation types (a.k.a. set-theoretic types) can be added to existing languages. It features cutting edge typing techniques: parametric polymorphism with local type inference, overloading, occurrence typing, gradual typing, and a sophisticated type analysis of patterns and guards. It is currently (and gradually) being integrated in Elixir. One important aspect of the type system is an accurate typing of maps, which are a fundamental data structure in Elixir; this is the subject of this project.

Goals

The typing of maps [5] focuses on two key forms: structs and dictionaries. Structs associate different keys with different types, requiring nominal keys for access, whereas dictionaries link all keys to a single type, with non-existent keys returning a default value. Our type theory accommodates both these structures, enabling Elixir maps to adapt to either struct-like or dictionary-like usage seamlessly. Elixir maps can therefore have several domain key types which are not all necessarily present (e.g. integers are sent to atom types, and atoms are sent to tuples). Importantly, domains are non-overlapping; this is a key limitation that comes both from design consideration and from the fact that the theory of semantic subtyping, used to compute subtyping between types, disregards the order of the fields. The type system for maps presented at this year's ICFP conference [5] proposes a solution where domains are pre-defined, and represent simple base types such as integer, atom, tuple, list, or the super-type of all types, noted term. However, in practice, some more fine-grained key types could be useful, for instance types {term, term} and {term, term, term}, which respectively represent tuples of size 2 and 3, but are subtypes of the type tuple. The project aims to explore the challenges of integrating more fine-grained key types into the type-checker for maps in Elixir. The objectives include:
  1. Developing a method to represent a disjoint partition of the complete set of types, where the list of possible domains for map keys must be comprehensive yet distinct.
  2. Creating a mechanism to reconcile differences in type partitions across different programs, enabling comparison and compatibility.
  3. Designing a system where these partitions can be precisely defined and adapted fluidly, enhancing the typing system's accuracy and flexibility.
The research will be carried out at the Research Institute on the Foundations of Computer Science (IRIF) in Paris under the supervision of Giuseppe Castagna (principal investigator for the Elixir Type System, directeur de recherche CNRS at IRIF), Guillaume Duboc (PhD student whose thesis subject consists in the design and implementation of the Elixir type system), and collaboration with José Valim (creator and main developer of the Elixir programming language, at Dashbit). The project will explore both theoretical (reflection on the structure of map types, design of the typing rules) and practical (e.g., languages design and implementation) aspects, in a proportion that will depend on the profile of the candidate. The perspective of this proposal is that the results of this project will be integrated in the compiler of Elixir.
References
[3,5] are mandatory reading