I'm a final-year PhD student in natural language processing at the University of Copenhagen, where I'm advised by Prof. Anders Søgaard.

My research broadly focuses on the development of multimodal (text, audio, vision) and multilingual language models via self-supervised representation learning and post-training techniques. I am also interested in topics related to tokenization, privacy-preserving ML, interpretability, and model merging.

Contrary to popular belief, I am not related to the programming language, the video game, or the home of Europa-Park.

Experience

08/24 – 02/25

Cambridge, UK

Applied Scientist Intern, Amazon AGI

Manager: Marius Cotescu

Post-training for Amazon Nova Speech-to-Speech. Research on modular and multi-objective preference optimization for conversational speech in multimodal LLMs.

06/23 – 12/23

Menlo Park, USA

Research Scientist Intern, Meta (FAIR)

Manager: Jean Maillard

Research on self-supervised video pretraining for large-scale sign language translation.

Publications

ACL 2024

Towards Privacy-Aware Sign Language Translation at Scale

Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard

HuCLLM Workshop @ ACL 2024

Vision-Language Models under Cultural and Inclusive Considerations

Antonia Karamolegkou, Phillip Rust, Yong Cao, Ruixiang Cui, Anders Søgaard, Daniel Hershcovich

EMNLP 2023

PHD: Pixel-Based Language Modeling of Historical Documents

Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

EMNLP 2023

Text Rendering Strategies for Pixel Language Models

Jonas Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott

ICLR 2023

⭐ Notable-top-5% (Oral)

Language Modelling with Pixels

Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott

ICML 2023

Differential Privacy, Linguistic Fairness, and Training Data Influence: Impossibility and Possibility Theorems for Multilingual Language Models

Phillip Rust, Anders Søgaard

ACL 2022

Challenges and Strategies in Cross-Cultural NLP

Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard

ICFHR 2022

Date Recognition in Historical Parish Records

Constanza Fierro, Laura Cabello Piqueras, Jonas F. Lotz, Phillip Rust, Joen Rommedahl, Jeppe Klok Due, Christian Igel, Desmond Elliott, Carsten Bøcker Pedersen, Israfel Salazar, Anders Søgaard

ACL 2021

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

ACL 2020

PuzzLing Machines: A Challenge on Learning From Small Data

Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych

Education

2021 – Present

University of Copenhagen

Ph.D. in Computer Science

Advisor: Prof. Anders Søgaard

2018 – 2020

Technical University of Darmstadt

M.Sc. in Computer Science

Advisors: Jonas Pfeiffer & Prof. Iryna Gurevych

2015 – 2018

DHBW Stuttgart

B.Sc. in Computer Science