soran-ghaderi/README.md

Soran Ghaderi

Open Source ML Developer and Researcher | Foundational Models for Generalization, Reasoning, & Generative AI
Personal Website | Twitter | LinkedIn | Google Scholar | ORCID | ∇ TorchEBM Blog | TDS Articles | Medium | Email

πŸ“ Open to research collaborations on the following topics (please reach out via email).

About Me

I received my MSc in Artificial Intelligence at the University of Essex, supervised by Prof. L. Citi. Previously, I obtained my bachelor's degree from the University of Kurdistan under the supervision of Dr P. Moradi.

During my postgraduate studies, I worked on reasoning in LLMs for code generation, which led me to develop "Neural Integration of Iterative Reasoning (NIR) for Code Generation". NIR adds a separate deep-think stage with self-reflection and integrates the resulting thoughts (hidden states) directly into the hidden states of the LLM's main generation pass using model surgery.
My ultimate objective is to study the cognitive mechanisms underlying intelligence and develop agents capable of reasoning and interacting with the real world.

πŸ“ I am actively seeking AI/ML Researcher/Engineer roles / Internships where I can contribute to cutting-edge research in AI. I'm also looking for PhD positions (for the upcoming academic year).

A brief overview of my contributions and activities over the years

[Image: overview of contributions and activities]

Research Interests

My research interests focus on building more capable, efficient, and generalizable AI systems. I am particularly interested in the intersection of deep learning, statistical mechanics, probabilistic modeling, and geometric methods.

Key areas of interest include:
  • Generative Modeling: Developing and understanding Energy-Based Models (EBMs, e.g., TorchEBM), Diffusion/Score-Based Models, Normalizing Flows & Optimal Transport, and Consistency Models. This involves leveraging mathematical tools such as Ordinary Differential Equations (ODEs), Partial Differential Equations (PDEs, e.g., Fokker-Planck), and Stochastic Differential Equations (SDEs) to define, train, and sample from these models. My work also incorporates concepts from Stochastic Calculus (e.g., Itô Calculus) and Optimal Transport Theory.

    Parallel to these, I have experience in developing language models and autoregressive generation for sequential data.
  • Generalization, Reasoning & Planning: Creating models that exhibit robust out-of-distribution (OOD) performance for complex decision-making.
  • Geometric & Mathematical Foundations of ML: Applying Differential Geometry (e.g., Riemannian manifolds), Metric Learning, and insights from alternative formulations (e.g., Hamiltonian & Lagrangian mechanics) to design more theoretically grounded and efficient learning algorithms.
  • Efficient Architectures: Transformers & Attention Mechanisms
  • RL & Agents
  • Embodied Intelligent Agents
  • Applications: AI for science, robotics, and LLMs
Current Focus: I am actively exploring the unification of diffusion, flow-based, and energy-based models through the lens of statistical mechanics, differential geometry, and other mathematical frameworks. The goal is to design generalizable intelligent agents capable of sophisticated, fast generation and reasoning (viewed as an optimization or inference problem) and robust planning under uncertainty.

Open Source Projects

My current flagship project is ∇ TorchEBM, alongside ongoing research into advanced generative modeling techniques.

  • TorchEBM (PyTorch): PyTorch library for Energy-Based Models, with Diffusion Models under development; simplifies training (Contrastive Divergence, Score Matching), sampling, and research. Official Website and Docs
  • 📕 NIR (PyTorch): MSc Dissertation: Integrating contexts from iterative reasoning directly into LLMs' hidden states for enhanced code generation. Project Website
  • 📚 TransformerX (TensorFlow): Library for building and experimenting with transformer-based models and LLMs.
  • 📚 Emgraph (TensorFlow): Library for knowledge graph embeddings: development, training, and evaluation.
  • 📚 Bigraph: A library that extends some of the link prediction algorithms for bipartite graphs.

Other: 📚 TASE, Nano automatic differentiation framework, EfficientCoF, Make-a-Video (partially), P2P quantum-messaging (conceptual)

Skills and Experiences (Broadly!)

  • Generative AI & ML:
    - EBMs, Diffusion Models, Normalizing Flows, and Probabilistic Modeling.
    - Transformers and LLMs (autoregressive generation).
    - Representation Learning.
  • Languages & Frameworks: PyTorch, TensorFlow, JAX (Familiar), Hugging Face Transformers.
  • Math & Algorithms: Optimization, MCMC (Langevin, HMC), Calculus, Linear Algebra, Probability, Statistics, elements of Stochastic Calculus, applied Differential Geometry, Optimal Transport.
  • Software & Tools: Git, API Design, TDD principles, Docker (Familiar), CI/CD (Familiar through GH Actions), HPC, and GPU Programming (Basic)

Profile Summary

Pinned

  1. torchebm (Public)

    Build and train energy-based and diffusion models in PyTorch ⚑.

    Python, 12 stars, 1 fork

  2. tensorops/TransformerX (Public)

    Flexible Python library providing building blocks (layers) for reproducible Transformers research (TensorFlow ✅, PyTorch 🔜, and JAX 🔜)

    Python, 53 stars, 8 forks

  3. bi-graph/Emgraph (Public)

    A Python library for knowledge graph representation learning (graph embedding).

    Python, 39 stars, 3 forks

  4. bi-graph/Bigraph (Public)

    Bipartite-network link prediction in Python

    Python, 94 stars, 21 forks

  5. make-a-video (Public)

    "Make-A-Video", new SOTA text-to-video model by Meta-FAIR (TensorFlow)

    Python, 14 stars, 2 forks

  6. appheap/TASE (Public)

    TASE (Telegram Audio Search Engine): A lightning-fast audio full-text search engine on top of Telegram

    Python, 11 stars, 1 fork