I'm a final-year PhD student in natural language processing at the University of Copenhagen, where I'm advised by Prof. Anders Søgaard.
My research broadly focuses on the development of multimodal (text, audio, vision) and multilingual language models via self-supervised representation learning and post-training techniques. I am also interested in topics related to tokenization, privacy-preserving ML, interpretability, and model merging.
Contrary to popular belief, I am not related to the programming language, the video game, or the home of Europa-Park.
Experience
Applied Scientist Intern — Amazon AGI
Manager: Marius Cotescu
Post-training for Amazon Nova Speech-to-Speech. Research on modular and multi-objective preference optimization for conversational speech in multimodal LLMs.
Research Scientist Intern — Meta (FAIR)
Manager: Jean Maillard
Research on self-supervised video pretraining for large-scale sign language translation.
Publications
EMNLP 2023
PHD: Pixel-Based Language Modeling of Historical Documents
Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein
EMNLP 2023
Text Rendering Strategies for Pixel Language Models
Jonas Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott
ACL 2022
Challenges and Strategies in Cross-Cultural NLP
Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard
ICFHR 2022
Date Recognition in Historical Parish Records
Constanza Fierro, Laura Cabello Piqueras, Jonas F. Lotz, Phillip Rust, Joen Rommedahl, Jeppe Klok Due, Christian Igel, Desmond Elliott, Carsten Bøcker Pedersen, Israfel Salazar, Anders Søgaard
Education
University of Copenhagen
Ph.D. in Computer Science
Advisor: Prof. Anders Søgaard
Technical University of Darmstadt
M.Sc. in Computer Science
Advisors: Jonas Pfeiffer & Prof. Iryna Gurevych
DHBW Stuttgart
B.Sc. in Computer Science