Rami Al-Rfou

Member of Technical Staff - TLM

OpenAI

Biography

Rami Al-Rfou is a Member of Technical Staff at OpenAI, where he is a founding member of the robotics effort and leads work on embodied foundation models.

Previously, Rami was a Senior Staff Research Scientist at Waymo Research, where he led foundational motion modeling work for forecasting and planning. His team developed scalable transformer-based approaches for motion prediction, scaling laws, and efficient distillation methods for deployment.

Before Waymo, Rami was a Staff Research Scientist at Google Research. He led and contributed to multilingual and token-free language modeling work including mT5, ByT5, and prompt tuning, and helped deploy assisted writing systems such as SmartReply and SmartCompose across Google products.

Rami received his PhD in Computer Science from Stony Brook University under the supervision of Prof. Steven Skiena. His research has focused on large-scale representation learning across language, graphs, and embodied systems.

Experience


Member of Technical Staff - TLM

OpenAI

Jan 2024 – Present · San Francisco, CA

Responsibilities include:

  • Founding member of OpenAI robotics; helped define strategy, scope, and technical direction
  • Built and led a robotics ML team spanning model, data, and evaluation
  • Led early embodied scaling studies, tokenization design, and offline evaluation methodology
  • Partnered across hardware, software, operations, and leadership to deliver end-to-end milestones

Senior Staff Research Scientist

Waymo Research

Mar 2021 – Jan 2024 · Mountain View, CA

Responsibilities include:

  • Led foundational motion modeling research for forecasting and planning
  • Developed scaling laws for open-loop and closed-loop motion metrics
  • Designed efficient multimodal behavior models and distillation methods for deployment
  • Managed and grew the foundational models research team

Staff Research Scientist

Google Research

Jun 2015 – Mar 2021 · Mountain View, CA

Responsibilities include:

  • Led multilingual LLM research including mT5 and ByT5
  • Co-developed parameter-efficient prompt tuning methods
  • Technical lead for multilingual SmartReply and SmartCompose systems
  • Led deep retrieval and retrieval-augmented language modeling efforts

Research Intern

Microsoft Research

Jun 2013 – Aug 2013 · New York City, NY
Host: Leon Bottou
“Investigated new ways to improve semi-supervised learning with word embeddings.”

Research Intern

Google Research

Jun 2012 – Aug 2012 · Mountain View, CA
Host: Jay Ponte
“Developed a language-independent, semi-supervised method for multilingual coreference resolution using word embeddings and a fine-tuned dual-encoder ranking model.”

Software Engineer Intern

Google

Jun 2011 – Aug 2011 · Mountain View, CA
Host: Mario Guajardo
“Developed a visualization system for Google’s data centers' internal networks.”

Education


PhD in Natural Language Processing

Stony Brook University

Sep 2010 – Jun 2015 · Stony Brook, NY
Dissertation: Polyglot: A Massive Multilingual Natural Language Processing Pipeline. Adviser: Steven Skiena.
Committee: Yejin Choi, Leman Akoglu, Leon Bottou

BSc in Computer Engineering

University of Jordan

Sep 2004 – Feb 2009 · Amman, Jordan
Thesis: TCP Performance over Wireless Networks: Analysis & Simulation.
GPA: 3.79/4.0

Recent Publications

Scaling Laws of Motion Forecasting and Planning

Scaling behavior for motion forecasting and planning models in autonomous driving.

MoST: Multi-modality Scene Tokenization for Motion Prediction

Multi-modality scene tokenization for motion prediction.

Scaling Motion Forecasting Models with Ensemble Distillation

Distillation methods for scaling motion forecasting models.

WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

A raw sensor benchmark for motion forecasting.

MotionLM: Multi-Agent Motion Forecasting as Language Modeling

Multi-agent motion forecasting as language modeling.

Wayformer: Motion Forecasting via Simple and Efficient Attention Networks

Efficient attention architecture for motion forecasting.

Narrowing the Coordinate-Frame Gap in Behavior Prediction Models

Distillation for efficient and accurate scene-centric motion forecasting.

ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models

Token-free byte-to-byte language modeling at scale.

nmT5: Is Parallel Data Still Relevant for Pre-training Massively Multilingual Language Models?

Parallel data and pre-training dynamics for massively multilingual LMs.

Patents

  • Parameter efficient prompt tuning for efficient models at scale
    US Patent 12,524,711 (2026)

  • Trajectory prediction using efficient attention neural networks
    US Patent 12,497,079 (2025)

  • Adapting foundation models for autonomous driving
    US Patent App. 19/209,351 (2025)

  • Scene tokenization for motion prediction
    US Patent App. 18/950,830 (2025)

  • Behavior prediction using scene-centric representations
    US Patent App. 18/913,074 (2025)

