Deep generative models parametrize very flexible families of distributions able to fit complicated datasets of images or text. These models provide independent samples from complex high-dimensional distributions at negligible cost. On the other hand, sampling exactly from a target distribution, such as the Boltzmann distribution of a physical system, is typically challenging: either because of dimensionality,...
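To make the contrast concrete, here is a minimal sketch (our illustration, not the speaker's method) of one standard bridge between the two: draw cheap samples from a trained model q and reweight them toward a Boltzmann target p(x) ∝ exp(-βU(x)) by self-normalized importance sampling. The `model.sample` and `model.log_prob` interfaces are hypothetical.

```python
import numpy as np

# Hedged sketch: self-normalized importance reweighting of generative-model
# samples toward a Boltzmann target p(x) ∝ exp(-beta * U(x)).
# `log_q_x` are the model log-densities of the samples and `U_x` their
# potential energies -- both assumed precomputed.

def boltzmann_weights(log_q_x, U_x, beta=1.0):
    log_w = -beta * U_x - log_q_x     # log of unnormalized importance weights
    log_w -= log_w.max()              # shift for numerical stability
    w = np.exp(log_w)
    return w / w.sum()                # weights such that E_p[f] ≈ Σ_i w_i f(x_i)

# Usage (hypothetical model API):
# x = model.sample(10_000)
# w = boltzmann_weights(model.log_prob(x), U(x))
```

Such reweighting is consistent but degrades quickly when the model misses modes of the target, which is one reason exact sampling remains hard.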
Quantum computing aims to leverage the principles of quantum mechanics, such as superposition, to encode and process information in ways that classical computers cannot, potentially handling exponentially larger amounts of information. However, harnessing this computational advantage requires quantum algorithms capable of encoding data into superpositions and providing answers with minimal...
Measuring the uncertainty associated with a model's prediction is a central part of statistical practice. In modern deep learning practice, several methods for quantifying the uncertainty of neural networks co-exist. Yet theoretical guarantees for these methods remain scarce. In this talk, I will discuss how some of them compare in a mathematically...
Riboswitches are structured allosteric RNA molecules that change conformation in response to a metabolite binding event, eventually triggering a regulatory response. Computational modelling of the structure of these molecules is complicated by a complex network of tertiary contacts, stabilized by the presence of their cognate metabolite. In this work, we focus on the aptamer domain of SAM-I...
Physicists routinely need probabilistic models for a number of tasks such as parameter inference or the generation of new realizations of a field. Establishing such models for highly non-Gaussian fields is a challenge, especially when the number of samples is limited. In this paper, we introduce scattering spectra models for stationary fields and we show that they provide accurate and robust...
Climate models and Numerical Weather Prediction (NWP) models describe the atmospheric circulation with a limited resolution. There unavoidably remain processes that involve spatial scales shorter than the grid scale, i.e. processes that are unresolved. Cloud processes, turbulence near the surface and internal gravity waves propagating from lower to upper layers are among the main dynamical...
In this talk I will first introduce the basics of radar imaging and present some applications for climate science. I will then show how machine learning can make a key contribution to improving radar data degraded by the speckle phenomenon and to extracting useful information. I will focus on self-supervised methods, which allow exploiting a wide range of unlabeled data.
Data assimilation is a central problem in many geophysical applications, such as weather forecasting. It aims to estimate the state of a potentially large system, such as the atmosphere, from sparse observations, supplemented by prior physical knowledge. The size of the systems involved and the complexity of the underlying physical equations make it a challenging task from a computational...
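As a concrete anchor for the estimation problem just described, here is a minimal sketch of the classical Kalman analysis step (standard textbook material, not necessarily the methods discussed in the talk): a prior "background" state, carrying the physical knowledge, is corrected by sparse noisy observations.

```python
import numpy as np

# Hedged sketch: one Kalman analysis step, the simplest data assimilation
# update. x_b is the background (prior) state with covariance B; y are the
# observations seen through the linear operator H, with noise covariance R.

def kalman_analysis(x_b, B, y, H, R):
    S = H @ B @ H.T + R                          # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)               # Kalman gain
    x_a = x_b + K @ (y - H @ x_b)                # analysis (posterior mean) state
    B_a = (np.eye(x_b.size) - K @ H) @ B         # analysis covariance
    return x_a, B_a
```

At atmospheric scale, B is far too large to store explicitly, which is one source of the computational challenge mentioned above.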
Large astronomical facilities generate an ever-increasing data volume, rapidly approaching the exascale, following the need for better resolution, better sensitivity, and larger wavelength coverage. Modern radio astronomy is strongly affected, especially regarding giant radio interferometers that produce large quantities of raw data. In particular, the forthcoming arrival of the SKA (Square...
The field of experimental astronomy is entering an exciting new era, with the emergence of extremely large telescopes hosting primary mirrors the size of several basketball courts. Among the many challenges associated with the construction and operation of such giant scientific infrastructures, the complexity of the embedded computing facilities stands out. In particular, the real-time...
Recent years have witnessed intense interactions between cognitive neuroscience and artificial intelligence, with the deep learning revolution driving new developments in neuroscience.
A first aspect concerns the processing of neuroscience data, which often takes the form of time courses. These data are typically short and noisy, and suffer from poorly controlled confounding effects. AI-powered...
Do representations proposed in linguistic theories, such as constituent trees, correspond to actual data structures constructed in real-time in the brain during language comprehension? And if so, what are the brain regions involved? This question was investigated in a series of functional magnetic resonance studies using various experimental paradigms, including repetition priming, syntactic...
A machine-readable and verifiable account of a large portion of human mathematics would change the way mathematicians can work, learn and collaborate. While impressive progress has been made in the mathematical standard libraries of proof assistants like Lean, Isabelle and Coq, the proportion of mathematical results formalized in such systems remains tiny overall. In the talk, I will argue...
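To make "machine-readable and verifiable" concrete, here is a toy formalized statement in Lean 4 (our illustration, not an example from the talk); the proof term is checked mechanically by the system rather than by a human reader.

```lean
-- A toy machine-checked theorem in Lean 4: commutativity of natural-number
-- addition, discharged by the standard library lemma Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```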
Single-cell data constitute a major breakthrough in the life sciences. Their integration will enable us to investigate outstanding biological and medical questions thus far inaccessible. However, few methods yet exist to integrate different single-cell modalities, corresponding to omics data (e.g. DNA methylation, proteome, chromatin accessibility), plus spatial positioning and images....
The combination of artificial intelligence and the increasing digitization of the health sector opens up prospects for using data both for research and for day-to-day decision-making tools for patients and healthcare providers. However, the systematic deployment of these technologies requires better control of their performance, particularly in terms of generalization and explainability. These notions...
In this work we adapt recent model reduction approaches to predict the solutions of time-dependent parametrized problems describing crowd motion in the presence of obstacles. The problem of interest is a discrete contact model, which is formulated as a constrained least-squares optimization statement. The parametric variations in the problem (associated with the geometric configuration...
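For orientation, here is a minimal sketch of one time step of a generic discrete contact model of this kind (our illustration under stated assumptions, not the paper's code): the realized velocity is the constrained least-squares projection of a desired velocity onto the set of velocities that keep inter-agent gaps non-negative over the step. The names `u_des`, `gaps` and `G` are hypothetical placeholders.

```python
import numpy as np
from scipy.optimize import minimize

# Hedged sketch of one step of a discrete contact model:
#   min_u ||u - u_des||^2   subject to   gaps + dt * (G @ u) >= 0,
# where `gaps` stacks the current inter-agent distances (minus diameters)
# and `G` their gradients with respect to positions.

def contact_step(u_des, gaps, G, dt):
    cons = {"type": "ineq", "fun": lambda u: gaps + dt * (G @ u)}
    res = minimize(lambda u: np.sum((u - u_des) ** 2),
                   x0=u_des,                        # start from the desired velocity
                   jac=lambda u: 2.0 * (u - u_des),
                   constraints=[cons], method="SLSQP")
    return res.x                                    # admissible velocity for this step
```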
Generative models have started to enter the scientific computing toolkit. One notable instance of this integration is the use of normalizing flows (NFs) in the development of sampling and variational inference algorithms. This work introduces a novel algorithm, GflowMC, which relies on a Metropolis-within-Gibbs framework within the latent space of NFs. This approach addresses...
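As a rough illustration of the general idea (a sketch of Metropolis-within-Gibbs in a flow's latent space, not the GflowMC implementation itself), one can pull the target density back through the flow and update latent coordinates one at a time with an independence proposal from the Gaussian base. The `flow.forward` interface returning a sample and its log-Jacobian determinant is an assumption.

```python
import numpy as np

# Hedged sketch: Metropolis-within-Gibbs in the latent space of a normalizing
# flow. `flow.forward(z)` is assumed to return (x, log_det) with x = T(z) and
# log_det = log|det J_T(z)|; `log_target` is the unnormalized log-density of
# the distribution of interest in data space.

def latent_logp(z, flow, log_target):
    x, log_det = flow.forward(z)
    return log_target(x) + log_det     # target pulled back to latent space

def gibbs_mh_sweep(z, flow, log_target, rng):
    logp = latent_logp(z, flow, log_target)
    for i in range(z.size):                        # update one coordinate at a time
        z_prop = z.copy()
        z_prop[i] = rng.standard_normal()          # independence proposal from N(0, 1)
        logp_prop = latent_logp(z_prop, flow, log_target)
        # Metropolis-Hastings correction for the N(0, 1) independence proposal
        log_q_ratio = 0.5 * (z_prop[i] ** 2 - z[i] ** 2)
        if np.log(rng.uniform()) < logp_prop - logp + log_q_ratio:
            z, logp = z_prop, logp_prop
    return z
```

Updating coordinates blockwise keeps each acceptance test cheap relative to a global proposal, which is the appeal of the Metropolis-within-Gibbs structure.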