
About Me

I am an NLP Researcher and Quant at G-Research.

Before that, I did my PhD at Sorbonne Université while working as a Research Engineer at BNP Paribas. My PhD, Deep Learning for Data-to-Text Generation, was supervised by Patrick Gallinari and Laure Soulier from the MLIA team. All projects (solos & duos) are available on GitHub and arXiv.

Work

I work as a Quant Researcher, meaning I research systematic trading ideas to predict the future of financial markets, applying scientific techniques to find patterns in large, noisy, and rapidly changing real-world data sets. In other words, I apply and develop state-of-the-art NLP approaches (read: transformers) to find trading signals in large textual corpora. By making computers do the trading, we remove human error and make sure only rigorously proven-to-work strategies are deployed.

Before that, I was a Research Engineer at BNP Paribas. In practice, I bridged the gap between research/academia and applications/enterprise, as part of the team that developed the internal company-wide search engine, as well as a number of other tools (translation platform, document NLP, etc.).

Academic Research

Right now, I am interested in all things NLP.

During my PhD, I worked on Data-to-Text Generation (DTG), i.e. building systems able to:

  • comprehend complex structured data (e.g. tables, graphs, etc.);
  • produce a fitting description (from one sentence to several paragraphs).

These systems are crucial in environments where raw data is abundant but hardly usable as is (e.g. health, sports, etc.), because end-users are more effective when provided with textual summaries than with structured data [1].
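
To make the task concrete, here is a minimal, hypothetical sketch of what a data-to-text pair can look like. The record fields, values, and template are illustrative assumptions only; actual DTG systems learn this mapping with neural encoders and decoders rather than a hand-written template:

```python
# Toy data-to-text example: a structured record in, a one-sentence description out.
# Everything here (fields, values, template) is hypothetical and for illustration only.

record = {
    "player": "A. Example",
    "points": 31,
    "rebounds": 12,
    "team": "Example City",
}

def describe(rec: dict) -> str:
    """Render a one-sentence description of a box-score-like record."""
    return (
        f"{rec['player']} scored {rec['points']} points "
        f"and grabbed {rec['rebounds']} rebounds for {rec['team']}."
    )

print(describe(record))
# A. Example scored 31 points and grabbed 12 rebounds for Example City.
```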

My PhD work focused on a critical aspect of DTG: ensuring factual accuracy in system outputs. Neural networks have proven shockingly good at producing fluent text, but end-users care more about accuracy than about readability [2]. Wrong descriptions that must be revised by human experts are of limited utility. To this end, I proposed novel neural encoding modules better suited to complex structured data, evaluation protocols that better discriminate between models by leveraging the structured data, and training procedures that prevent models from picking up biased human behaviours (such as mentioning unverifiable facts).

In 2021, I focused on working with other PhD students, with notably fruitful collaborations with the University of Turin (Italy), the University of Aberdeen (UK), and Sorbonne Université (France).

Hobbies

On a personal note, I am a climbing enthusiast and try to swim at least once a week. I greatly enjoy storytelling, both reading and going to the movies (I used to go twice a week w/ a movie pass before I moved to London). I'm also a fan of cooking: meals, desserts, as well as cocktails 🍹 See the Gallery Section for some proof that I go outside!

[1]: From data to text in the Neonatal Intensive Care Unit: Using NLG technology for decision support and information management. Gatt et al., 2009.
[2]: An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems. Reiter and Belz, 2009.