Research Scientist in Deep Learning at Skysoft ATM
PhD in François Fleuret's Machine Learning Group
Mail: | arn.pannatier@gmail.com |
Software: | http://github.com/arnaudpannatier |
Publications: | Google Scholar |
Chess.com account: | arnaudpannatier |
Twitter: | @ArnaudPannatier |
Hi! I'm Arnaud. I'm a Research Scientist in Deep Learning at Skysoft ATM in Geneva, Switzerland. My current work is on using LLMs for automated compliance and software quality. I'm also developing a wind nowcasting tool for the aviation industry.
Before that, I started my PhD in François Fleuret's Machine Learning group in March 2020 and successfully defended it in August 2024. My last publication was σ-GPTs, a novel way to train GPTs in an order other than left-to-right. The main idea is to train the model to generate sequences in a random order, which allows conditional density estimation, infilling, and generating sequences in bursts using a novel rejection sampling method. We describe it in a short Twitter thread (seen 362k times).
While exploring that idea, we also compared it to a discrete diffusion baseline, which can also generate sequences in bursts. We were surprised to see that diffusion models were able to solve a path-finding task, and we made a short Twitter thread about it (seen 194k times).
In the first part of my Ph.D., I worked on forecasting high-altitude wind from live data. Current forecasts given by national weather agencies are too sparse in time and space to be reliable for managing air traffic. We tried to solve that problem by nowcasting the wind based on the latest aircraft measurements. We first introduced a GPU-accelerated smart kernel averaging method, published at SIAM Data Mining 2022. We then noticed that sets of measurements are naturally modelled by attention-based models, and we introduced a single-stack transformer encoder which takes as input both the aircraft measurements and the query point to predict, processing them at the same time using full attention.
I also worked on HyperMixer, which is based on Florian Mai's idea of using hypernetworks to enable MLPMixer to handle variable-length inputs. This allowed the model to handle inputs in a permutation-invariant manner, and we showed that it gave the model a kind of attention behavior which scales linearly with the input length.
Before my Ph.D., I completed a Bachelor's in Physics followed by a Master's in Computer Sciences and Engineering (CSE - MATH), both at EPFL. I received the Kudelski Award for my Master's thesis, "A control plane in time and space for locality preserving blockchains", in the Decentralized Distributed Systems Laboratory.
A. Pannatier, E. Courdier, and F. Fleuret
σ-GPTs: A New Approach to Autoregressive Models.
In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2024.
demo web /
arxiv /
twitter thread
A. Pannatier, K. Matoba, and F. Fleuret
Inference from Real-World Sparse Measurements.
In Transactions on Machine Learning Research (TMLR), 2024.
openreview /
arxiv /
code
F. Mai, A. Pannatier, F. Fehr, H. Chen, F. Marelli, F. Fleuret, and J. Henderson
HyperMixer: An MLP-based Low Cost Alternative to Transformers.
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
publication /
arxiv /
code /
video: AI Coffee Break interview
A. Pannatier, R. Picatoste, and F. Fleuret
Efficient Wind Speed Nowcasting with GPU-Accelerated Nearest Neighbors Algorithm.
In Proceedings of the SIAM International Conference on Data Mining (SDM), 2022.
publication /
arxiv /
slides