N 43°39′ · W 79°23′ · Toronto

Erfan Miahi

Currently

Founding Research Engineer · Covenant AI. Working on off-policy RLVR, decentralized post-training, and the strange limits of what a large model can learn from a thousand untrusted peers.

Based: Toronto, Canada
Formerly: RLAI · Sutton's lab
Field: Reinforcement Learning
Tagline: Research Engineer / Scientist · Post-training · Reasoning
02 Now

Current conditions.

Pressure: 11,000 psi · Last ping: April 18, 2026

2026 · Q2

Leading the RL post-training team at Covenant AI · comms efficiency and off-policiness.

This month

Shipping a new trainer-to-trainer comms algorithm. (The weight-comms paper is already out, cited by Fireworks × Cursor for Composer 2.)

Reading

Nietzsche · Thus Spoke Zarathustra. Re-reading Mathematics for Machine Learning.

Training

34 skydive jumps logged. Target: 200 by end of summer. Wind tunnels at iFLY in between.

03 About

A brief dispatch.

Portrait of Erfan Miahi, Founding Research Engineer · Covenant AI

I'm a research engineer & scientist working on the hard parts of post-training: off-policy RL, verifiable rewards, and decentralized training infrastructure. Nine years in AI, five shipping production ML.

I did my M.Sc. at the University of Alberta's RLAI Lab under Martha White and Marlos C. Machado. Collaborated with Google DeepMind. Published across EMNLP, AIJ, MLJ, TMLR, and IEEE T-Cyb. Ten papers, 212 citations.

I'm a founding research engineer at Covenant AI, where we pre-trained Covenant-72B, a 72B LLM trained across trustless peers over the open internet. My weight-update-sparsity paper (arXiv 2602.03839, ~100× bandwidth efficiency) is cited by Fireworks in their globally-distributed training of Cursor's Composer 2. Before Covenant, I was a founding ML research engineer at DeepR Analytics, and I interviewed at YC S24 (top 7%).

When I surface from the work: 34 skydive jumps and climbing toward 200 by end of summer, wind tunnels at iFLY between weekends, parkour when a city asks for it, long-distance running when it doesn't. The body moves as much as the mind. I'm drawn to the art of movement and to anything that asks you to meet the edge honestly. The two thinkers I keep returning to are Nietzsche and Jung. I've mentored 10+ students since 2017, mostly on AI research and career.

Citations: 212
Publications: 10
Skydives: 34
Meta Llama Toronto '24: 1st
Covenant pre-train: 72B
Mentees since 2017: 10+
04 Research

Publications & preprints.

EMNLP · AIJ · MLJ · TMLR · DeepMind collaboration · 212 citations across ten papers.

  1. 2026

    Understanding and Exploiting Weight Update Sparsity for Communication-Efficient Distributed RL

    Erfan Miahi, Eugene Belilovsky

    First author

    arXiv
    preprint
  2. 2026

    How Reliable are Confidence Estimators for Large Reasoning Models?

    Reza Khanmohammadi, Erfan Miahi, Sahar Kaur, Charese Smiley, Isabella Brugere

    European Chapter of the ACL

    EACL
    main
  3. 2026

    Covenant-72B: Pre-Training a 72B LLM with Trustless Peers Over-the-Internet

    Joel Lidin, Amir Sarfi, Erfan Miahi, Quentin Anthony, Suraj Chauhan, Eugenios Pappas

    Largest decentralized pre-train to date

    arXiv
    preprint
  4. 2025

    Calibrating LLM Confidence by Probing Perturbed Representation Stability

    Reza Khanmohammadi, Erfan Miahi, Mardikoraem, Kaur, Brugere

    Empirical Methods in Natural Language Processing

    EMNLP
    main
  5. 2024

    Investigating the Properties of Neural Network Representations in Reinforcement Learning

    Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy

    DeepMind collaboration · Artificial Intelligence Journal

    AIJ
    journal
  6. 2024

    GVFs in the Real World: Making Predictions Online for Water Treatment

    Muhammad K. Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White

    Machine Learning Journal

    MLJ
    journal
  7. 2023

    ResMax: An Alternative Soft-Greedy Operator for Reinforcement Learning

    Erfan Miahi, Revan MacQueen, Alex Ayoub, Abbas Masoumzadeh, Martha White

    Transactions on Machine Learning Research

    TMLR
    journal
  8. 2022

    Genetic Neural Architecture Search for Automatic Assessment of Human Sperm Images

    Erfan Miahi, S.A. Mirroshandel, A. Nasr

    First author · NAS for medical imaging

    Expert Systems w/ Apps.
    journal
  9. 2022

    Scalable Transfer Evolutionary Optimization: Coping with Big Task Instances

    Mojtaba Shakeri, Erfan Miahi, Abhishek Gupta, Yew-Soon Ong

    NTU × A*STAR collaboration

    IEEE T-Cyb
    journal
  10. 2021

    Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection

    A. Abbasi, Erfan Miahi, S.A. Mirroshandel

    Most-cited paper · Computers in Biology and Medicine

    Comp. Bio. & Med.
    journal
07 Reading

Books at depth.

47 read · 20 currently reading · 176 on the shelf · 4.86 avg rating on Goodreads.

Currently reading

Accelerate

Nicole Forsgren
engineering

The Startup of You

Reid Hoffman
entrepreneurship

Team Geek

Brian Fitzpatrick
engineering
On Intelligence
Jeff Hawkins
Five Dialogues
Plato
A Little History of the World
E.H. Gombrich
The Denial of Death
Ernest Becker
Memories, Dreams, Reflections
C.G. Jung
Good to Great
Jim Collins
The Art of Being
Erich Fromm
The Last Lecture
Randy Pausch
Art History
Dana Arnold
How to Win Friends…
Dale Carnegie
See full shelf on Goodreads ↗
The secret for harvesting from existence the greatest fruitfulness and the greatest enjoyment is: to live dangerously.
Friedrich Nietzsche · Die fröhliche Wissenschaft · §283
08 Mentorship

A small apprenticeship.

Since 2017 I've mentored 10+ students through monthly or bi-weekly meetings, mapping passion to path. One mentee went from a regional Iranian university to a PhD at Michigan State.

How to apply

Email with 'Mentorship Program' in the subject line. Address is in the Contact section below.

You might be a fit if
  1. You want to do AI research or engineering, and want an honest sparring partner.

  2. You're willing to show up consistently, even when the work stops being exciting.

  3. You'd rather be told the truth than something comfortable.

Surface contact

Write a better letter than the usual one.

mhi.erfan1 · at · gmail · com
CV · PDF ↗