
· One min read

Mixture-of-Experts (MoE) is a powerful way to scale large language models (LLMs): instead of running the full model for every token, a router activates only a few "experts," giving more capacity at roughly the same compute.

But routing is still a sore spot. Most MoE systems use Top-k + Softmax, where expert selection is discrete—so you don't get clean end-to-end gradients. In practice, this can lead to unstable routing, calibration issues, and uneven expert usage.

In our ICLR 2026 paper, we introduce DirMoE — a fully differentiable probabilistic router that separates which experts fire (Bernoulli) from how their weights are assigned (Dirichlet). We also add a simple "sparsity knob" (Simpson-index penalty) to control the expected number of active experts, without relying on load-balancing losses that can homogenize experts.
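The two-distribution idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: I assume independent Bernoulli gates per expert for selection, a single Dirichlet draw for contribution weights, and the Simpson index (sum of squared weights) as the quantity a sparsity penalty would target; the function names and the fallback when no gate fires are my own.

```python
import numpy as np

def dirmoe_route(gate_logits, alpha, rng):
    """Illustrative sketch: Bernoulli gates decide WHICH experts fire,
    Dirichlet weights decide HOW MUCH each active expert contributes."""
    # Selection: one independent Bernoulli gate per expert.
    p_fire = 1.0 / (1.0 + np.exp(-gate_logits))   # sigmoid -> firing probability
    mask = rng.random(p_fire.shape) < p_fire
    if not mask.any():                            # assumed fallback: keep one expert
        mask[np.argmax(p_fire)] = True
    # Contribution: Dirichlet weights, masked to the active set, renormalized.
    w = rng.dirichlet(alpha) * mask
    return w / w.sum(), mask

def simpson_index(w):
    # Sum of squared weights; 1 / Simpson behaves like an "effective number
    # of active experts", which a penalty could push toward a target.
    return float(np.sum(w ** 2))

rng = np.random.default_rng(0)
weights, mask = dirmoe_route(np.array([2.0, -1.0, 0.5, 0.0]),
                             alpha=np.ones(4), rng=rng)
```

Note that with uniform weights over k experts the Simpson index is 1/k, so its inverse recovers k; penalizing it gives a continuous knob on the expected number of active experts without a discrete Top-k cutoff.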

Results: DirMoE matches or exceeds vanilla MoE throughput (no extra bottlenecks), is competitive or stronger on zero-shot benchmarks (ARC, BoolQ, PIQA, …), and leads to clearer expert specialization, with interpretable domain focus (e.g., ArXiv, Books, GitHub code).

DirMoE: Dirichlet-Routed Mixture of Experts — disentangling expert selection (Bernoulli) from expert contribution (Dirichlet)

Led by Hesam Asadollahzadeh and Amirhossein Vahidi.

Read the paper on OpenReview · Thread on X

· One min read

I am excited to join ELLIS, the European Laboratory for Learning and Intelligent Systems. I am very grateful to the fellows who supported me.


About ELLIS:

ELLIS is a pan-European AI network of excellence which focuses on fundamental science, technical innovation and societal impact. Founded in 2018, ELLIS builds upon machine learning as the driver for modern AI and aims to secure Europe’s sovereignty in this competitive field by creating a multi-centric AI research laboratory. ELLIS wants to ensure that the highest level of AI research is performed in the open societies of Europe and follows a three-pillar strategy to achieve that.

· One min read

I am excited that our work on modeling single-cell perturbations (e.g., drugs, disease, CRISPR manipulations), from my time at Meta AI in collaboration with Helmholtz Munich, is now featured on the cover of Molecular Systems Biology. Thanks to all collaborators and my co-authors for making this happen.


About the paper:

Lotfollahi, M.+, Klimovskaia Susmelj, A.+, De Donno, C.+, Hetzel, L., Ji, Y., Ibarra, I. L., ... & Theis, F. J.

[Molecular Systems Biology (2023)], [code], [Facebook AI blogpost], [state of AI report 2021], [featured cover].

· One min read

I am greatly honored that the Bayer Foundation has awarded me the "Early Excellence in Science Award," recognizing my work on "developing machine learning algorithms to understand large-scale single-cell omics data in health and disease to ultimately facilitate the advancement of precision medicine and AI-assisted drug discovery." Big thanks to my family, previous and current collaborators, and mentors.

Watch the award video here (click on the image)

Laudation video

[Award Ceremony], [Read about the award].

· One min read

Proud that our paper on life-long and transfer learning for single-cell biology was selected by anonymous reviewers as one of the top 3 papers at MDSI, Technical University of Munich. This work was done during my doctoral studies at Helmholtz Munich and the School of Life Sciences at TUM.

MDSI award.

About the paper:

Mapping single-cell data to reference atlases by transfer learning

Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M., et al.

[Nature Biotechnology (2022)], [code], [MDSI best paper award], [featured cover in Nature Biotechnology].