# Big Data

### Teaching team: Alain Barrat, Bruno Goncalves, Nicolas Schabanel

Location: École Normale Supérieure de Lyon, Jacques Monod Campus, Amphithéâtre B (Main Building, 3rd or 4th floor)

Person in charge: N. Schabanel

### Schedule (from Monday Jan 19, 9:30 to Friday Jan 23, 16:45)

| | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| 9:30-10:30 | Networks 1 | Algorithms 2 | Networks 3 | Exercises session 3 (Algorithms) | Written exam (Algorithms and Networks) |
| 10:45-11:45 | Algorithms 1 | Networks 2 | Algorithms 3 | Exercises session 4 (Networks) | |
| 13:30-15:30 | Data mining 1 (Preliminaries) | Data mining 2 (Twitter) | Data mining 3 (Github) | Data mining 4 (Instagram) | Project defenses |
| 15:45-16:45 | (Languages) | Exercises session 1 (Networks) | Networks 4 | (Languages) | |
| 17:00-18:00 | | Exercises session 2 (Algorithms) | Algorithms 4 | | |

### Lecture contents

| Networks | Algorithms | Data Mining |
|---|---|---|
| Definitions, statistical analysis, models | Streaming algorithms | Preliminaries |
| Epidemic spreading on networks | Dynamical graph algorithms | Twitter |
| Social network analysis | Algorithmic perspective on social networks | Github |
| | | Instagram |

### Evaluation and Projects

• Algorithms and Networks will be evaluated by a written exam on Friday morning.
• The data mining sessions will be dedicated to developing group research projects on computers; the projects will be defended on Friday afternoon.

### General presentation

During this school we will review several important techniques for dealing with huge amounts of data, from both the algorithmic and the data analysis points of view. Among others, we will learn techniques that make it possible to compute parameters of the data precisely when the data do not fit in the computer's memory, using so-called streaming algorithms and property-testing techniques. We will also learn techniques for discovering structure in data, related to the clustering, cut and facility location problems, using linear and semi-definite programming, first component decomposition (recommendation systems), belief propagation and compressive sensing (from an algorithmic perspective). We will also review specific techniques for dealing with the possible dynamic features of the data.
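To give a flavour of what a streaming algorithm looks like, here is a minimal sketch of the classical Misra-Gries frequent-items summary. It is chosen purely for illustration and is not necessarily one of the algorithms covered in the lectures; the function name `misra_gries` and its parameters are hypothetical. The algorithm reads the stream once and keeps at most `k - 1` counters, whatever the length of the stream. Every item occurring more than `n / k` times in a stream of length `n` is guaranteed to survive in the summary, and each stored count underestimates the true count by at most `n / k`.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Illustrative sketch: `stream` is any iterable of hashable items and
    `k` (>= 2) controls the memory/accuracy trade-off.
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item already tracked: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # Free slot available: start tracking the item.
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop those reaching zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Small usage example: "a" occurs 6 times out of 12, more than 12 / 3 = 4.
example_stream = ["a", "b", "a", "c", "a", "d", "a", "b", "a", "e", "a", "b"]
summary = misra_gries(example_stream, k=3)
print(summary)
```

The stored counts are only lower bounds, so a second pass over the data is needed if exact frequencies of the surviving candidates are required.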

In a second part, we will describe how to gather 'big data' in practice, in particular by using the public APIs of Twitter and Facebook. We will learn how to actively collect large-scale datasets that provide a unique view on human social connections and mobility, collective attention and information spreading. We will also discuss the tools customarily used for the statistical characterization of large networks and the modeling frameworks whose development has been stimulated by the empirically observed characteristics of many real-world networks. Finally, we will describe the modeling of dynamical phenomena that take place on complex networks, such as epidemic spreading, information propagation or opinion formation.
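As an illustration of what modeling epidemic spreading on a network can mean, the following is a minimal sketch of a discrete-time SIR (susceptible, infected, recovered) process on a graph. The model variant, the function name `sir_on_network` and all parameters are assumptions made for this example and are not taken from the course material: at each step, every infected node infects each susceptible neighbour with probability `beta`, then recovers with probability `mu`.

```python
import random


def sir_on_network(adjacency, patient_zero, beta, mu, rng, max_steps=1000):
    """Simulate a discrete-time SIR process on a graph (illustrative sketch).

    adjacency: dict mapping each node to a list of its neighbours.
    Returns the final status of every node and the number of infected
    nodes observed at the start of each step.
    """
    status = {node: "S" for node in adjacency}
    status[patient_zero] = "I"
    history = []
    for _ in range(max_steps):
        infected = [node for node, s in status.items() if s == "I"]
        history.append(len(infected))
        if not infected:
            break  # The epidemic has died out.
        new_status = dict(status)  # Synchronous update of all nodes.
        for node in infected:
            for neighbour in adjacency[node]:
                if status[neighbour] == "S" and rng.random() < beta:
                    new_status[neighbour] = "I"
            if rng.random() < mu:
                new_status[node] = "R"
        status = new_status
    return status, history


# Usage example on a small path graph 0 - 1 - 2 - 3 (hypothetical data).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
final_status, infected_counts = sir_on_network(
    path, patient_zero=0, beta=0.5, mu=0.3, rng=random.Random(42)
)
print(final_status, infected_counts)
```

Passing the random generator explicitly keeps runs reproducible, which makes it easier to average the outcome over many seeds or to compare different network structures.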

This research school is organized as part of the Modeling Complex Systems Master 2 program, which is co-affiliated with the Computer Science Department and the Physics Department of ENS Lyon and hosted by the IXXI Rhône-Alpes Complex Systems Institute. The program is headed by Dr. Márton Karsai (marton.karsai@ens-lyon.fr) and Dr. Pierre Borgnat (pierre.borgnat@ens-lyon.fr). Students with a background in Computer Science or Physics are expected to participate, but students from other disciplines are also welcome. A registration letter should be sent to Dr. Schabanel and to the program heads.