February 16, 2015

Let data (science) speak

The Doctoral College at IFPEN (IFP Energies nouvelles) organizes seminars for PhD students. The next one, on 30 March 2015, is about data science: "Faire parler les mesures, de la capture (acquisition) aux premiers mots (apprentissage) : la science des données, une discipline émergente", or "Let data speak, from its capture (acquisition) to its first words (learning): data science, an emerging discipline".

The invited speakers are Igor Carron (Nuit Blanche), Laurent Daudet (Institut Langevin) and Stéphane Mallat (École normale supérieure). Abstracts and slides follow, with two musical interludes.
For those who could not attend, or for a second look, videos exist for the talks by Laurent Daudet and Stéphane Mallat.
Laurent Duval, Aurélie Pirayre, IFPEN

*Title: Introduction to data science
*Abstract: Data science (or "dédoménologie" in French) is an emerging discipline: the term "data science" appeared in 2001 and designates a corpus of techniques drawing on information science, mathematics, statistics, computer science, visualization and machine learning. It aims at extracting knowledge from experimental data, potentially complex, huge or heterogeneous, by revealing patterns or structures that are not explicit. The field is notably driven by the GAFA companies (Google, Apple, Facebook, Amazon), and plays an increasing role in biology, medicine, social sciences, astronomy and physics.

The talks illustrate a few facets of this discipline: how to exploit a form of randomness in measurements, how to see through paint, how to learn to classify with non-linearities?

Igor Carron

*Title: "Ca va être compliqué": Islands of knowledge, Mathematician-Pirates and the Great Convergence
*Abstract: In this talk, we will survey the different techniques that have led to recent changes in the way we do sensing and in how we make sense of that information. In particular, we will talk about problem complexity and attendant algorithms, compressive sensing, advanced matrix factorization, sensing hardware and machine learning, and how all these seemingly unrelated issues matter to the practising engineer. We will also draw parallels between some of the machine learning techniques currently used by internet companies and the upcoming convergence that will occur in many fields of engineering and science as a result.

Laurent Daudet, Institut Langevin, Ondes et images

*Title: Compressed Sensing Imaging through multiply scattering materials (Un imageur compressé utilisant les milieux multiplement diffusants)
*Abstract: The recent theory of compressive sensing leverages the structure of signals to acquire them with far fewer measurements than was previously thought necessary, and certainly well below the traditional Nyquist-Shannon sampling rate. However, most implementations developed to take advantage of this framework revolve around controlling the measurements with carefully engineered material or acquisition sequences. Instead, we use the natural randomness of wave propagation through multiply scattering media as an optimal and instantaneous compressive imaging mechanism. Waves reflected from an object are detected after propagation through a well-characterized complex medium. Each local measurement thus contains global information about the object, yielding a purely analog compressive sensing method. We experimentally demonstrate the effectiveness of the proposed approach for optical imaging by using a 300-micrometer thick layer of white paint as the compressive imaging device. Scattering media are thus promising candidates for designing efficient and compact compressive imagers.
(joint work with I. Carron, G. Chardon, A. Drémeau, S. Gigan, O. Katz, F. Krzakala, G. Lerosey, A. Liutkus, D. Martina, S. Popoff)
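
To make the idea of compressive acquisition concrete, here is a minimal numerical sketch, not the method or the code of the talk. It assumes that a real-valued Gaussian random matrix `A` can stand in for the well-characterized scattering medium, that the object `x` is sparse, and that recovery is done by Orthogonal Matching Pursuit, a greedy algorithm chosen here only for brevity. All names and sizes (`omp`, `n`, `m`, `k`) are hypothetical.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedy estimate of a k-sparse x from y = A @ x."""
    n = A.shape[1]
    residual = y.copy()
    selected = np.zeros(n, dtype=bool)   # columns already picked
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        correlations[selected] = 0.0
        j = int(np.argmax(correlations))
        selected[j] = True
        support.append(j)
        # Least-squares fit on the selected columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 512, 128, 4                    # signal length, measurements, sparsity

# Sparse "object": k non-zero entries with random signs and magnitudes in [1, 2].
x = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
x[idx] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)

# Random measurement matrix (assumption: stand-in for the scattering medium).
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x                                # each measurement mixes the whole object

x_hat = omp(A, y, k)
error = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"{m} measurements for {n} unknowns, relative error: {error:.2e}")
```

Each entry of `y` is a random combination of all entries of `x`, which mirrors the statement that each local measurement contains global information about the object. The sparsity assumption is what allows recovery from four times fewer measurements than unknowns in this toy setting.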

Stéphane Mallat, École Normale Supérieure

*Title: Learning Signals, Images and Physics with Deep Neural Networks
*Abstract: Big data, huge memory and computational capacity are opening a scientific world which did not seem reachable just a few years ago. Besides brute-force computational power, algorithms are evolving quickly. In particular, deep neural networks provide impressive classification results for many types of signals, images and data sets. It is thus time to wonder what type of information is extracted by these network architectures, and why they work so well.

Learning does not seem to be the key element of this story. Multirate filter banks together with non-linearities can compute multiscale invariants, which appear to provide stable representations of complex geometric structures and random processes. This will be illustrated through audio and image classification problems. We also show that such architectures can learn complex physical functionals, such as quantum chemistry energies.
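
As a toy illustration of how a filter bank followed by a non-linearity can produce invariants without any learning, here is a single-layer sketch. It is not the architecture presented in the talk: the Gaussian band-pass filters, the one-filter-per-octave spacing, the global time average and the names `filter_bank` and `invariant_features` are all assumptions made for this example.

```python
import numpy as np

def filter_bank(n, num_scales):
    """Gaussian band-pass filters in the Fourier domain, one per octave (illustrative choice)."""
    freqs = np.fft.fftfreq(n)            # normalized frequencies in [-0.5, 0.5)
    filters = []
    for j in range(num_scales):
        center = 0.25 / 2 ** j           # center frequency halves at each scale
        width = center / 2
        filters.append(np.exp(-0.5 * ((freqs - center) / width) ** 2))
    return np.array(filters)             # shape (num_scales, n)

def invariant_features(x, filters):
    """Band-pass filtering, modulus non-linearity, then averaging over time."""
    x_hat = np.fft.fft(x)
    # Circular convolution with every filter at once, done in the Fourier domain.
    bands = np.fft.ifft(x_hat[None, :] * filters, axis=1)
    # The modulus removes the phase; the average removes the position in time.
    return np.abs(bands).mean(axis=1)

n = 512
filters = filter_bank(n, num_scales=5)

rng = np.random.default_rng(0)
x = rng.normal(size=n)
x_shifted = np.roll(x, 37)               # same signal, circularly translated

features = invariant_features(x, filters)
features_shifted = invariant_features(x_shifted, filters)
print(np.allclose(features, features_shifted))
```

The features are unchanged by a circular translation of the signal, while still depending on how its energy is spread across scales. A full multiscale representation would cascade several such layers, which this sketch does not attempt.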