I am a Ph.D. student in Computer Science at Stanford University, working on NLP and ML.
Nowadays I mostly think about how to better decouple knowledge from capabilities in language models. Can we build smaller models that don't memorize half the internet, yet can still reason and perform complex tasks? I am also interested in multimodal representation learning, and in principled and effective methods for contextualizing models with external information.
Before Stanford, I worked as an engineer at Meta.
I completed my B.Sc. and M.Sc. in Computer Science at Ben-Gurion University, where I worked on the evaluation of tokenization algorithms for language models.
Feel free to reach out!