Open Source ML Developer and Researcher | Foundational Models for Generalization, Reasoning, & Generative AI
Personal Website | Twitter | LinkedIn | Google Scholar | ORCID | TorchEBM Blog | TDS Articles | Medium | Email
I received my MSc in Artificial Intelligence from the University of Essex, supervised by Prof. L. Citi. Previously, I obtained my bachelor's degree from the University of Kurdistan under the supervision of Dr P. Moradi.
During my postgraduate studies, I worked on reasoning in LLMs for code generation, through which I developed "Neural Integration of Iterative Reasoning (NIR) for Code Generation". NIR consists of a separate deep-think stage with self-reflection, whose thoughts (hidden states) are integrated directly into the hidden states of the LLM's main generation pass via model surgery.
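As a rough illustration of the general idea of fusing an auxiliary reasoning pass into a generation pass, here is a toy PyTorch sketch with hypothetical module names; it is not the NIR implementation, only a gated residual blend of two hidden-state tensors:

```python
import torch
import torch.nn as nn

class HiddenStateFusion(nn.Module):
    """Toy sketch: gate an auxiliary 'reasoning' hidden state into the
    main generation hidden state. Hypothetical; not the NIR code."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, gen_hidden: torch.Tensor, reason_hidden: torch.Tensor) -> torch.Tensor:
        # gen_hidden, reason_hidden: (batch, seq_len, d_model)
        r = self.proj(reason_hidden)
        g = torch.sigmoid(self.gate(torch.cat([gen_hidden, r], dim=-1)))
        return gen_hidden + g * r  # residual injection of reasoning context

# Usage sketch with made-up shapes
fusion = HiddenStateFusion(d_model=768)
h_gen = torch.randn(2, 16, 768)     # hidden states from the main generation pass
h_reason = torch.randn(2, 16, 768)  # hidden states from the deep-think stage
h_fused = fusion(h_gen, h_reason)   # (2, 16, 768)
```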
My ultimate objective is to study the cognitive mechanisms underlying intelligence and develop agents capable of reasoning and interacting with the real world.
I am actively seeking AI/ML researcher/engineer roles and internships where I can contribute to cutting-edge AI research. I am also looking for PhD positions for the upcoming academic year.
A brief overview of my contributions and activities over the years

My research interests focus on building more capable, efficient, and generalizable AI systems. I am particularly interested in the intersection of deep learning, statistical mechanics, probabilistic modeling, and geometric methods.
Key areas of interest include (click to expand)
- Generative Modeling: Developing and understanding Energy-Based Models (EBMs, e.g., TorchEBM), Diffusion/Score-Based Models, Normalizing Flows & Optimal Transport, and Consistency Models. This involves leveraging mathematical tools such as Ordinary Differential Equations (ODEs), Partial Differential Equations (PDEs, e.g., Fokker-Planck), and Stochastic Differential Equations (SDEs) to define, train, and sample from these models, together with concepts from Stochastic Calculus (e.g., Itô calculus) and Optimal Transport theory. In parallel, I have experience developing language models and autoregressive generation for sequential data. (A minimal Langevin sampling sketch follows this list.)
- Generalization, Reasoning & Planning: Creating models that exhibit robust out-of-distribution (OOD) performance for complex decision-making.
- Geometric & Mathematical Foundations of ML: Applying Differential Geometry (e.g., Riemannian manifolds), Metric Learning, and insights from alternative formulations (e.g., Hamiltonian & Lagrangian mechanics) to design more theoretically grounded and efficient learning algorithms.
- Efficient Architectures: Transformers & Attention Mechanisms
- RL & Agents
- Embodied Intelligent Agents
- Applications: AI for science, robotics, and LLMs
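To make the EBM/SDE connection above concrete, here is a minimal sketch of unadjusted Langevin dynamics sampling from an unnormalized energy. It uses plain PyTorch and a toy quadratic energy; it is not the TorchEBM API, just an illustration of the sampler:

```python
import torch

def energy(x: torch.Tensor) -> torch.Tensor:
    # Toy quadratic energy: a standard Gaussian up to normalization.
    return 0.5 * (x ** 2).sum(dim=-1)

def langevin_sample(n_steps: int = 200, step_size: float = 0.01,
                    n_samples: int = 512, dim: int = 2) -> torch.Tensor:
    """Unadjusted Langevin dynamics: x <- x - eta * grad E(x) + sqrt(2 * eta) * noise."""
    x = torch.randn(n_samples, dim)
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
    return x.detach()

samples = langevin_sample()
print(samples.mean(dim=0), samples.std(dim=0))  # roughly zero mean, unit std
```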
My current flagship project is TorchEBM, alongside ongoing research into advanced generative modeling techniques.
| Project | Description |
|---|---|
| TorchEBM | PyTorch library for Energy-Based Models (Diffusion Models under development); simplifies training (Contrastive Divergence, Score Matching), sampling, and research. Official Website and Docs. (A plain-PyTorch contrastive divergence sketch follows this table.) |
| NIR | MSc dissertation: integrating context from iterative reasoning directly into an LLM's hidden states for enhanced code generation. Project Website |
| TransformerX | Library for building and experimenting with transformer-based models and LLMs. |
| Emgraph | Library for developing, training, and evaluating knowledge graph embeddings. |
| Bigraph | Library that extends some link prediction algorithms to bipartite graphs. |

Other: TASE, Nano automatic differentiation framework, EfficientCoF, Make-a-Video (partial implementation), P2P quantum messaging (conceptual)
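As an illustration of the kind of training loop such a library streamlines, below is a minimal contrastive divergence (CD-k) step in plain PyTorch. The model, shapes, and data are made up for the sketch; this is not TorchEBM's API:

```python
import torch
import torch.nn as nn

class MLPEnergy(nn.Module):
    """Small MLP mapping a sample to a scalar energy."""
    def __init__(self, dim: int = 2, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def cd_step(energy: nn.Module, data: torch.Tensor, opt: torch.optim.Optimizer,
            k: int = 20, step_size: float = 0.01) -> float:
    """One contrastive divergence update: lower energy on data,
    raise it on short-run Langevin samples started from noise."""
    x_neg = torch.randn_like(data)
    for _ in range(k):
        x_neg = x_neg.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x_neg).sum(), x_neg)[0]
        x_neg = x_neg - step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x_neg)
    x_neg = x_neg.detach()

    opt.zero_grad()
    loss = energy(data).mean() - energy(x_neg).mean()
    loss.backward()
    opt.step()
    return loss.item()

# Usage sketch on a toy Gaussian batch
model = MLPEnergy()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.randn(256, 2) * 0.5 + torch.tensor([1.0, -1.0])
print(cd_step(model, batch, optimizer))
```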
Click to Expand
- Generative AI & ML:
  - EBMs, Diffusion Models, Normalizing Flows, and Probabilistic Modeling
  - Transformers and LLMs (autoregressive generation)
  - Representation Learning
- Languages & Frameworks: PyTorch, TensorFlow, JAX (familiar), Hugging Face Transformers
- Math & Algorithms: Optimization, MCMC (Langevin, HMC), Calculus, Linear Algebra, Probability, Statistics, elements of Stochastic Calculus, applied Differential Geometry, Optimal Transport.
- Software & Tools: Git, API design, TDD principles, Docker (familiar), CI/CD (familiar via GitHub Actions), HPC, and GPU programming (basic)