Hubbleflow · Course

Agentic AI Mastery.

From Software Engineer to AI Architect.

You already use AI to write software. This course teaches you to write software that is AI.

A live weekend cohort for engineers already shipping with Cursor and Claude Code, taking you from user of AI tools to builder of the agentic and GenAI systems behind them.

Live weekend cohort · 19 modules across 5 levels · 5 portfolio-grade capstones
The Syllabus

Nineteen modules. Every one exists for a reason.

Grouped into five levels of increasing AI depth. Each module lists what you’ll actually master and what you’ll build.

LEVEL 01

Foundation

MODULE
01

A Brief History of the Evolution of Intelligence

Before we touch a single API, we walk carefully through the eighty-year arc that brought us here, from the first mathematical model of a neuron to the ChatGPT moment. Every design decision inside a modern LLM is a deliberate answer to a question raised decades ago.

Key concepts
McCulloch & Pitts (1943) · Rosenblatt's Perceptron (1958) · Minsky & Papert and the first AI winter · Backpropagation (Rumelhart, Hinton & Williams, 1986) · Yann LeCun & LeNet · Biological neurons, Hebbian learning, action potentials · Supervised, unsupervised and self-supervised learning · Fei-Fei Li & ImageNet (2009) · AlexNet (2012) · Sequence-to-sequence + attention · The Transformer (2017) · BERT, GPT-1/2/3 · Scaling laws (Kaplan 2020, Chinchilla 2022) · InstructGPT and the ChatGPT moment · The bitter lesson
Build

A Perceptron from scratch in NumPy, train it on a linearly-separable problem (Rosenblatt), watch it fail on XOR (Minsky & Papert), then add a hidden layer with backpropagation and watch it succeed.
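
The arc of this lab — Rosenblatt's rule succeeding on a separable problem, then a hidden layer with backpropagation cracking XOR — can be sketched in a few lines of NumPy. This is a compressed, illustrative version of the exercise; the hyperparameters and layer sizes are arbitrary choices, not the course's.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1], dtype=float)  # linearly separable
y_xor = np.array([0, 1, 1, 0], dtype=float)  # not linearly separable

def train_perceptron(X, y, epochs=50, lr=0.1):
    """Rosenblatt's rule: nudge weights toward each misclassified point."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = float(w @ xi + b > 0)
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
    return w, b

w, b = train_perceptron(X, y_and)
and_preds = (X @ w + b > 0).astype(float)
print(and_preds)  # → [0. 0. 0. 1.] — AND is learnable by a single layer

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# One hidden layer trained with backpropagation is enough for XOR.
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)
lr = 0.5
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                           # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y_xor[:, None]) * out * (1 - out)   # backprop: output layer
    d_h = d_out @ W2.T * h * (1 - h)                   # backprop: hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
xor_preds = (out > 0.5).astype(float).ravel()
print(xor_preds)  # with this seed, training typically recovers [0, 1, 1, 0]
```

The same single-layer training rule that solves AND never converges on XOR, which is exactly the limitation Minsky & Papert identified; the hidden layer is what changes the class of learnable functions.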

MODULE
02

LLM APIs & Prompt Engineering

The craft of shipping reliable output from LLMs in production. API mechanics across providers, prompt engineering that holds up under load, structured outputs for real data extraction, and the cost patterns you need from day one.

Key concepts
Multi-provider APIs (Claude, OpenAI, Gemini) · Streaming · Prompt caching · Structured outputs · Few-shot, chain-of-thought, XML tagging · Pydantic-AI, instructor · Model cascading · Cost economics
Build

Repo Sage, a CLI that summarises any GitHub repo and answers questions about its architecture.
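
The core of "structured outputs for real data extraction" is refusing to trust free-form text: the model must return a schema you can validate. A minimal stdlib-only sketch of that idea, with a hard-coded stand-in where a real SDK call would go (the `RepoSummary` fields and the fake reply are illustrative, not part of any provider API):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class RepoSummary:
    name: str
    primary_language: str
    stars: int

def parse_structured(raw: str, schema):
    """Parse a model reply into a typed object, failing loudly on drift."""
    data = json.loads(raw)
    expected = {f.name for f in fields(schema)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"model omitted fields: {missing}")
    return schema(**{k: data[k] for k in expected})

# Stand-in for a real LLM reply; in production this string comes from an API call.
fake_reply = '{"name": "nanoGPT", "primary_language": "Python", "stars": 30000}'
summary = parse_structured(fake_reply, RepoSummary)
print(summary.name)  # → nanoGPT
```

Libraries like instructor and Pydantic-AI automate this validate-and-retry loop; the lab builds the loop by hand first so the library behaviour is not magic.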

MODULE
03

From Tokens to Generation: How LLMs Actually Work

The inside of an LLM, piece by piece. Tokenisation, embeddings, and attention, built from scratch so nothing remains a black box. By the end of this module, no LLM behaviour will surprise you.

Key concepts
Byte-Pair Encoding (BPE) from scratch · Token vs positional embeddings · RoPE, ALiBi · Query / Key / Value attention · Multi-head attention · Causal masking · Cross attention (encoder-decoder, multimodal) · Flash Attention intuition · The full Transformer block (LayerNorm, residuals, FFN, SwiGLU) · Decoder-only vs encoder-decoder architectures
Build

Multi-head causal attention in PyTorch from scratch, then nanoGPT, a decoder-only Transformer, trained on Tiny Shakespeare.
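
The heart of that build is scaled dot-product attention with a causal mask. The lab implements it in PyTorch; the same math fits in a dependency-light NumPy sketch (single head, arbitrary toy dimensions):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product attention with a causal mask.
    x: (T, d_model); Wq/Wk/Wv: (d_model, d_head)."""
    T = x.shape[0]
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (T, T) token affinities
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf                           # no peeking at future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
T, d = 5, 8
x = rng.normal(size=(T, d))
out, w = causal_self_attention(x, *(rng.normal(size=(d, d)) for _ in range(3)))
print(np.allclose(np.triu(w, k=1), 0))  # → True: strictly causal
```

Multi-head attention repeats this with several smaller `d_head` projections in parallel and concatenates the results; the causal mask is what makes a decoder-only model trainable on next-token prediction.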

MODULE
04
Level 1 Capstone

Retrieval-Augmented Generation (RAG)

Give an LLM access to your data without fine-tuning. The full production RAG pipeline, from embeddings to retrieval to evaluation, grounded in the tradeoffs real engineering teams make.

Key concepts
Embedding models and tradeoffs · Vector DBs (pgvector, Pinecone, Weaviate, Chroma) · HNSW · Chunking strategies · Hybrid search (dense + BM25) · MMR · Reranking · Citation & grounding · RAGAS evaluation
Capstone Build

Personal Wiki AI, your own second brain. Markdown notes → hybrid search + AI Q&A with citations. Deployable as a Slack bot or Raycast extension.
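
Strip away the vector DB and the retrieval core of this capstone is: embed chunks, embed the query, rank by cosine similarity, pass the top-k to the LLM. A toy sketch using bag-of-words counts as a stand-in embedding (a real pipeline calls an embedding model; the note paths are invented):

```python
import numpy as np

docs = {
    "notes/rag.md": "hybrid search combines dense embeddings with BM25 keywords",
    "notes/hnsw.md": "HNSW builds a layered graph for fast nearest neighbours",
    "notes/chunking.md": "chunking splits documents before embedding",
}

vocab = sorted({w for text in docs.values() for w in text.split()})

def embed(text):
    """Toy bag-of-words unit vector; a real pipeline uses a learned embedding model."""
    counts = np.array([text.split().count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(counts)
    return counts / norm if norm else counts

index = {path: embed(text) for path, text in docs.items()}  # the "vector DB"

def retrieve(query, k=2):
    q = embed(query)
    scored = sorted(index.items(), key=lambda kv: -float(q @ kv[1]))
    return [path for path, _ in scored[:k]]

print(retrieve("how does hybrid search use BM25"))
```

Everything else in the module — chunking strategies, HNSW, reranking, MMR — exists to make this loop accurate and fast at scale, where brute-force cosine over every chunk stops being viable.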

LEVEL 02

Agent Builder

MODULE
05

Hacking Claude Code & Cursor as Power Platforms

The extension layer most developers never touch. Build custom skills, hooks, subagents, and workflows on top of the tools you already use daily. By mastering their extension points, you learn agent design by example.

Key concepts
Claude Code skills · Hooks (pre/post tool execution) · Subagents · Background tasks · Custom slash commands · MCP integration · CLAUDE.md memory hierarchy · Worktree isolation · Headless mode in CI · Cursor rules, MDC, custom modes, background agents
Build

A custom Claude Code workflow that ships a PR from a Linear ticket, using skills + hooks + subagents.

MODULE
06

Tool Use & Agent Patterns

The difference between a chatbot and an agent: tool use, reasoning loops, and the pattern decision tree that separates good agents from broken ones, implemented from scratch so you understand them without framework magic.

Key concepts
Tool / function calling mechanics · ReAct pattern (from scratch) · CodeAct pattern · Prompt chaining · Routing · Parallelisation · Orchestrator-worker · Evaluator-optimiser · Pattern decision tree · Anti-patterns catalogue
Build

ReAct and CodeAct agents from scratch, compared on a real multi-step research task.
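
The ReAct loop itself is small: the model emits a Thought and an Action, the runtime executes the tool, and the Observation is fed back until a Final Answer appears. A skeletal sketch with a scripted stand-in for the model (the `Action: tool[input]` format and the tool are illustrative choices, not a fixed standard):

```python
import re

def calculator(expr: str) -> str:
    # Toy tool: arithmetic only, with builtins stripped from eval's namespace.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def scripted_llm(history: str) -> str:
    """Stand-in for a model call; a real agent sends `history` to an LLM."""
    if "Observation:" not in history:
        return "Thought: I need to compute this.\nAction: calculator[17 * 23]"
    return "Thought: I have the result.\nFinal Answer: 391"

def react(question: str, max_steps: int = 5) -> str:
    history = f"Question: {question}"
    for _ in range(max_steps):
        reply = scripted_llm(history)
        history += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        m = re.search(r"Action: (\w+)\[(.+)\]", reply)
        if m:
            obs = TOOLS[m.group(1)](m.group(2))
            history += f"\nObservation: {obs}"  # fed back into the next step
    return "gave up"

print(react("What is 17 * 23?"))  # → 391
```

CodeAct replaces the constrained `Action:` grammar with a full code block the agent writes and executes, which is why it needs the sandboxing covered in Module 12.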

MODULE
07

Memory Systems for Agents

Give your agents the ability to remember, across turns, across sessions, across users. Build a standalone memory service that any agent can plug into.

Key concepts
Memory taxonomy (short-term, working, long-term, episodic, semantic) · Conversation buffer · Summary memory · Entity memory · Persistent vector-store memory · Token budgeting · Eviction strategies · Mem0 / Letta / LangMem architectures
Build

Personal Memory Layer, a Mem0-style memory-as-a-service with REST API, multi-tenant storage, and decay logic.
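
The simplest memory in the taxonomy, a conversation buffer with a token budget and oldest-first eviction, already forces the key design questions. A minimal sketch (token counting here is whitespace words purely for illustration; a real system uses the model's tokenizer):

```python
from collections import deque

class ConversationBuffer:
    """Short-term memory under a token budget; oldest turns are evicted first."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.turns = deque()

    def _tokens(self, text: str) -> int:
        return len(text.split())  # crude proxy; swap in a real tokenizer

    def add(self, role: str, text: str):
        self.turns.append((role, text))
        while sum(self._tokens(t) for _, t in self.turns) > self.max_tokens:
            self.turns.popleft()  # evict the oldest turn

    def render(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

buf = ConversationBuffer(max_tokens=12)
buf.add("user", "my name is Ada and I like graphs")
buf.add("assistant", "nice to meet you Ada")
buf.add("user", "what did I say I like")
print(buf.render())  # the first turn has been evicted to fit the budget
```

Notice what eviction silently destroyed: the user's name. Summary memory, entity memory, and persistent vector-store memory are all answers to exactly that loss.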

MODULE
08
Level 2 Capstone

Agentic RAG

Beyond naive retrieve-then-generate. Multi-step retrieval, query decomposition, reranking, and routers across multiple data sources: the RAG patterns real production systems use.

Key concepts
Query understanding & reformulation · Self-RAG · Query decomposition · HyDE · Contextual retrieval (Anthropic's approach) · Parent-child chunk retrieval · Reranking (Cohere, bge-reranker) · Multi-source routing · Production indexing pipelines · Grounding verification
Capstone Build

Mini Claude Code Clone, a terminal coding agent (Aider-style) that reads files, edits code, runs tests, iterates on errors, and plugs into the memory layer you built in this level.
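
One concrete technique behind multi-source retrieval is merging the ranked lists that different retrievers return. Reciprocal Rank Fusion is the standard trick: score each document by the sum of 1/(k + rank) across lists, with k = 60 as the conventional default. A sketch with invented document IDs:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge ranked lists from several retrievers.
    score(d) = sum over lists of 1 / (k + rank_of_d_in_that_list)."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense  = ["doc_a", "doc_c", "doc_b"]   # order from vector search
sparse = ["doc_b", "doc_a", "doc_d"]   # order from BM25
print(rrf([dense, sparse]))  # → ['doc_a', 'doc_b', 'doc_c', 'doc_d']
```

Because RRF only consumes ranks, not raw scores, it fuses retrievers whose scores live on incomparable scales — exactly the dense-plus-BM25 situation a hybrid pipeline faces before a reranker refines the merged list.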

LEVEL 03

Production Engineer

MODULE
09

Model Context Protocol (MCP)

The “USB-C for AI” — the protocol every major AI tool now speaks. Build servers, clients, and deploy them to production with the same patterns Anthropic uses internally.

Key concepts
Architecture (hosts, clients, servers) · Transports (stdio, SSE, Streamable HTTP) · TypeScript & Python SDKs · Exposing tools, resources, prompts · Input validation · OAuth flows · Sampling · MCP Inspector · Publishing to marketplaces (Smithery, mcp.so)
Build

A production MCP server wrapping a real API (Linear / GitHub / Postgres) with auth, logging, and health checks, published to a marketplace.

MODULE
10

Multi-Agent Systems

Orchestration patterns for agents working together. Supervisor, debate, consensus, and the framework tradeoffs that separate production-ready from research toys.

Key concepts
Planning & decomposition · Self-reflection (Reflexion) · Supervisor / orchestrator · Debate & consensus · Hierarchical teams · LangGraph vs CrewAI vs AutoGen · Human-in-the-loop gates · Progressive autonomy · LLM-as-judge evaluation · SWE-bench / GAIA / AgentBench
Build

A multi-agent code review system (reviewer + fixer + tester) orchestrated with LangGraph.

MODULE
11

Knowledge Graphs & GraphRAG

When vector retrieval isn't enough. Extract entities and relationships with LLMs, build your own graph context layer, and combine graph traversal with semantic search.

Key concepts
Why vector-only fails on multi-hop questions · Entity & relationship extraction with LLMs · Neo4j, Kuzu, Apache AGE · Microsoft GraphRAG · LightRAG · Community detection · Hybrid graph + vector retrieval · Incremental graph updates
Build

GraphRAG over your domain, pick a real dataset (legal / medical / codebase / product docs) and build a knowledge graph with LLM Q&A.

MODULE
12

Sandboxing, Browser Agents & Secure Execution

LLM-generated code cannot run untethered. The full isolation toolkit (gVisor, Daytona, E2B) and browser agents that run safely inside them.

Key concepts
Risk model for agent-generated code · gVisor application kernel · Daytona workspace lifecycle · E2B Code Interpreter · Firecracker microVMs · WASM sandboxing · Browser Use, Stagehand, Skyvern · Playwright + LLM · Choosing isolation levels
Build

A browser agent running inside a gVisor/Daytona sandbox that books a flight or scrapes a real site end-to-end.

MODULE
13
Level 3 Capstone

Production Deployment, Observability & LLM Security

Ship, watch, defend. Agents in production need tracing, guardrails, and red-teaming against the attacks specific to LLM systems: the failure modes that never show up in development.

Key concepts
Stateless vs stateful agents · Queue-based scaling · Observability (Langfuse, LangSmith, Helicone, Phoenix, Braintrust) · OWASP LLM Top 10 · Direct & indirect prompt injection · Data exfiltration patterns · Guardrails AI, NeMo Guardrails · Canary deploys · Regression testing with golden datasets
Capstone Build

Production Coding Assistant (MCP + gVisor + HITL + observability), or Deep Research Clone (long-running citation-backed research agent).

LEVEL 04

Systems Specialist

MODULE
14

Voice & Multimodal Agents

Agents that see, hear, and speak. Build real-time voice agents at sub-200ms latency and computer-use agents that control browsers and desktops.

Key concepts
Voice stack (LiveKit, Vapi, Retell, Pipecat, OpenAI Realtime) · Latency budgeting · Voice Activity Detection · Interruption & turn-taking · Vision-language models (Claude, GPT-4V, Gemini) · Computer Use · OpenAI Operator · Document AI · Whisper, Deepgram, ElevenLabs, Cartesia
Build

Voice Tutor Agent, a real-time voice agent for interview prep on a LiveKit + Cartesia stack, at sub-200ms latency.

MODULE
15

Training & Fine-Tuning Models

Go inside the model. Understand how models are pre-trained and aligned, then fine-tune open-source models for your own use cases with the techniques production teams actually ship.

Key concepts
Pre-training objectives · Chinchilla scaling laws · Distributed training (DDP, FSDP, ZeRO, tensor & pipeline parallelism) · Full vs LoRA vs QLoRA · Instruction tuning · RLHF (reward model + PPO) · DPO, ORPO, KTO · Constitutional AI · Unsloth, Axolotl
Build

LoRA fine-tune of Llama 3 8B on a custom instruction dataset, published to Hugging Face.
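
The core idea behind LoRA fits in a few lines: freeze the pretrained weight W and learn a low-rank update BA, scaled by alpha/r, so only a tiny fraction of parameters trains. A NumPy sketch of the math (dimensions shrunk from Llama scale to keep it cheap; the zero-init of B is the standard choice that leaves the model unchanged at the start of training):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 1024, 8, 16          # hidden size, LoRA rank, scaling factor

W = rng.normal(size=(d, d))        # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01 # trainable down-projection
B = np.zeros((d, r))               # trainable up-projection, zero-initialised

def lora_forward(x):
    # Base path plus low-rank update; with B = 0 behaviour is exactly the base model.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, d))
print(np.allclose(lora_forward(x), x @ W.T))  # → True at initialisation

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.4%}")
```

Only A and B receive gradients during fine-tuning, which is why LoRA checkpoints are megabytes rather than gigabytes and why QLoRA can additionally quantise the frozen W without touching the trainable path.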

MODULE
16
Level 4 Capstone

Inference Optimisation

Serving is where models go to die (or scale). KV cache, quantisation, Flash Attention, vLLM: the mechanics of fast, cheap inference at production scale.

Key concepts
Prefill vs decode phases · KV cache internals · Quantisation (INT8, INT4, GPTQ, AWQ, GGUF, FP8) · Flash Attention · Speculative decoding · Continuous batching · vLLM, TGI, Ollama, llama.cpp · TensorRT-LLM · MLX (Apple Silicon) · On-device / edge AI
Capstone Build

nanochat from Scratch: Karpathy-inspired full stack (pretrain → SFT → RLHF → serve), end to end. Or Voice + Browser Agent combining Modules 12 & 14.
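
The cheapest win in that list, quantisation, is easy to see end to end. A sketch of symmetric absmax INT8: store one float scale per tensor, round the weights into int8, and dequantise on the fly, trading a bounded round-trip error for a 4x memory cut (toy tensor sizes; production schemes like GPTQ and AWQ quantise per-group and calibrate on data):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric absmax quantisation: float32 -> int8 plus one scale per tensor."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)  # toy weight tensor

q, scale = quantize_int8(w)
error = np.abs(w - dequantize(q, scale)).max()

print(q.dtype, f"max round-trip error: {error:.2e}")
print(f"memory: {w.nbytes} bytes -> {q.nbytes} bytes")  # 4x smaller, plus one scale
```

The per-element error is at most half a quantisation step (scale / 2), which is why weights tolerate INT8 well while activations and KV cache entries, with their occasional large outliers, need the more careful schemes the module covers.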

LEVEL 05

AI Architect

MODULE
17

Reasoning Models & Advanced Architectures

The o1/o3 era. Chain-of-thought, tree-of-thought, test-time compute, and the architectures (MoE, Mamba) that are beginning to replace the classic Transformer.

Key concepts
Chain-of-thought · Tree-of-thought · Self-consistency · o1 / o3 / Claude Extended Thinking · DeepSeek-R1 · Mixture of Experts (Mixtral, DeepSeek-V3) · State Space Models (Mamba, RWKV) · Long-context (Ring Attention, Infini-Attention) · Knowledge distillation
Build

Side-by-side comparison of CoT vs ToT vs self-consistency on a reasoning benchmark, with cost / quality analysis.
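
Of the three methods compared, self-consistency is the simplest to pin down: sample several chain-of-thought completions at nonzero temperature and majority-vote on the final answers, paying k model calls for the reliability gain. A sketch with hard-coded stand-ins for the sampled completions (a real run draws them from the model):

```python
from collections import Counter

def self_consistency(samples):
    """Majority vote over independently sampled chain-of-thought answers.
    Returns the winning answer and its vote share as a rough confidence."""
    answers = [s["answer"] for s in samples]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)

# Stand-ins for k sampled CoT completions (a real run samples the model at T > 0).
samples = [
    {"reasoning": "…", "answer": "42"},
    {"reasoning": "…", "answer": "42"},
    {"reasoning": "…", "answer": "41"},
    {"reasoning": "…", "answer": "42"},
    {"reasoning": "…", "answer": "40"},
]
answer, confidence = self_consistency(samples)
print(answer, confidence)  # → 42 0.6
```

The vote share doubles as a cheap confidence signal, which is exactly the cost/quality axis the lab's benchmark comparison measures: more samples buy higher accuracy at linearly growing token cost.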

MODULE
18

Agent Platforms at Scale

The leap from shipping one agent to running a platform. Orchestration engines, governance, compliance, and team structure at organisational scale.

Key concepts
Platform vs point solution · Agent registry · Tool marketplace · Execution engine · Eval service · Multi-tenancy · Workflow engines (Temporal, Prefect) · Long-running agents & checkpointing · Audit trails, GDPR / SOC2 · Model & cost governance · AI team career ladders
Build

Personal AI Clone, an agent trained on your writing and voice that answers “as you” (LoRA fine-tune + voice clone + memory layer).

MODULE
19
Level 5 Capstone

Strategy, Career & Frontier

How AI architects think. Patterns for leading AI strategy at your company, navigating the AI job market, and staying ahead of the frontier as it accelerates.

Key concepts
Centralised vs federated AI teams · Build vs buy framework · Model selection strategy · Use case prioritisation (effort × value) · Measuring AI ROI · AI portfolio building · AI engineering interview prep · Autonomous coding agents (Devin / Cursor / Claude Code) · Vertical AI · AGI discourse
Capstone Build

AI Strategy Pitch + Working Demo, a full AI initiative pitched as if to a CTO, with a working demo of one component.

Taught by

Aseem Rastogi.

Software & AI Architect · Co-Founder & CTO, Agentcord.ai

Ex-Architect, iXiGo · Ex-Staff Engineer, Synaptic · B.Tech CSE, NIT Hamirpur (Gold Medalist)

With 12+ years of experience shipping production software, Aseem has architected and led the systems behind some of India's highest-traffic consumer platforms, and over the last 3 years has brought that same depth to building production-grade agentic AI for real businesses.

Today, Aseem is building Agentcord.ai on a deliberate two-layer agentic architecture: a deterministic orchestration plane in LangGraph that handles goal setting, context assembly, and agent routing, paired with a Temporal-based autonomous execution engine that runs long-horizon agent workflows that survive failures, retries, and multi-day cycles. Every agent executes inside a gVisor-sandboxed runtime, communicates with the outside world over an MCP-mediated tool plane, composes its behaviour from a custom Skills layer, and is observed continuously by an Agent Evals harness, with the entire platform deployed on EKS.

Before Agentcord, Aseem was the Architect who led the entire Trains & Bus backend at iXiGo for five years, owning its team and its systems end to end. From that period came CrowdSource Running Status, the flywheel that drives organic customer acquisition at iXiGo to this day, built directly with Prashant Ghidiyal, then VP Tech at iXiGo and today the GenAI and Cybersecurity head at Delhivery (and ex-CEO of Devtron Labs). The system ingests tens of millions of GPS and cell-tower events daily through a real-time streaming pipeline on Apache Flink fused with carrier APIs, and powers a real-time delay-prediction model built on ARIMA that set a new bar for running-status accuracy across the industry.

It is also the period in which he led the rearchitecture of the entire trains backend, graduating it from a classic microservices setup into a mature, reactive, strongly-consistent service mesh on Kubernetes and Istio. That programme retired static HAProxy edge routing in favour of dynamic, service-aware routing that treated canary and blue-green rollouts as first-class, replaced the previous Kibana-centric logging with an end-to-end NewRelic + Grafana + Loki observability stack, hardened failure isolation through a significantly improved Hystrix-based compartmentalisation layer, and introduced a Redis + ScyllaDB + Aerospike operational data layer tuned for the new latency and throughput profile. It was designed and shipped directly with iXiGo's CTO Rajnish Kumar and Ram Singla, today CEO of Temple at Eternal (the parent company of Zomato). Out of that same rearchitecture came an availability-prediction model that forecasts train-seat availability across routes and classes, shipping today inside the iXiGo train-search experience.
From the same years came iXiGo's Travel Graph, a multi-modal A2B search graph on JanusGraph, ScyllaDB, and ElasticSearch that plans a single journey across bus, train, and flight in one query, and a real-time CDC pipeline on Kafka Connect and Debezium that propagated trip and transaction state across the platform.

He then led the entire Core Data Engineering function at Synaptic, a 13-person team working directly with Anurag, the company's CTO. There he architected a workflow orchestrator on EKS, Apache Pulsar, JanusGraph, Apache Spark, and Apache Hudi, with Dask handling out-of-core parallel compute, that today runs 10,000+ concurrent pipelines over 10+ TB of data daily, and built out Synaptic's semantic-search and vector-embedding stack on ElasticSearch and ClickhouseDB alongside the Research team.

He recently served as architectural advisor on the full-stack modernisation and GenAI adoption of Squizify (Australia), helping the world's first AI-powered Food Security Compliance platform evolve its core compliance workflow from a reactive system into a proactive, anticipatory agentic one. Separately, he continues to provide ongoing technical assistance to the Government of Himachal Pradesh's Hydrology Department, where his real-time sensor pipelines and flood-prediction models now run state-wide.

One thread runs through all of it: production depth, shipped at scale, alongside the engineering leaders who set that bar. This is not a course taught from tutorials. It is taught by the architect who built the systems.

Format & Cadence

Live on weekends. Supported every day.

01

Live weekend cohort

Saturday and Sunday sessions, with recordings of every class.

02

Hands-on

Weekly labs and five portfolio-grade capstones across the five levels.

03

Support

Weekly office hours and capstone reviews per level.

04

Private Discord community

Dedicated channels per module and topic. Ask anything, any time. Every question and answer lives permanently, becoming a growing knowledge base for the cohort.

Tools you’ll master

The full modern agentic and GenAI toolkit.

Grouped by the problem it solves. Every one of these appears in at least one module or lab.

LLM Providers
Claude · OpenAI · Google Gemini · Llama · Mistral · DeepSeek · Qwen · OpenRouter
AI Dev Tools
Claude Code · Cursor · Copilot · Aider · Continue.dev · Windsurf
Agent SDKs & Frameworks
LangGraph · Temporal · Claude Agent SDK · OpenAI Agents SDK · Vercel AI SDK · Mastra · Pydantic-AI · CrewAI · AutoGen
RAG & Retrieval
pgvector · Pinecone · Weaviate · Chroma · Qdrant · LlamaIndex · LangChain · Cohere Rerank · RAGAS
Knowledge Graphs & Memory
Neo4j · Kuzu · Microsoft GraphRAG · LightRAG · Mem0 · Letta · LangMem
MCP & Tooling
MCP · MCP Inspector · Smithery · mcp.so
Sandboxing & Secure Execution
gVisor · Daytona · E2B · Firecracker · Docker · WASM Runtimes
Browser & Computer-Use Agents
Browser Use · Stagehand · Skyvern · Browserbase · Anthropic Computer Use · OpenAI Operator
Voice & Multimodal
LiveKit · Vapi · Retell AI · Pipecat · OpenAI Realtime · Deepgram · ElevenLabs · Cartesia · Whisper
Observability & Evaluation
Langfuse · LangSmith · Helicone · Arize Phoenix · Braintrust · DeepEval
Training & Inference
Hugging Face · Unsloth · Axolotl · vLLM · TGI · Ollama · llama.cpp · TensorRT-LLM · MLX
LLM Security & Guardrails
OWASP LLM Top 10 · Guardrails AI · NeMo Guardrails
What it costs
9,999
per month
A four-month cohort · 16 weeks
Costs less than
  • A pair of premium running shoes
  • Last weekend's bar tab in any metro
  • A single IPL playoff ticket
  • Your monthly Swiggy bill in a heavy WFH month
You leave with
  • 5 portfolio-grade builds shipped to your GitHub
  • The depth most AI engineers in the market don't have
  • A cohort and Discord community for life
  • The vocabulary to lead AI work at your company

The world is in the middle of something rare. AI is reshaping software faster than any shift since the internet. The engineers who go deep right now will shape what gets built over the next decade.

Doors open for the next cohort soon.

Frequently asked

Questions worth asking.

Do I need prior ML or AI experience?

No. The course is built for software engineers who haven't yet touched ML seriously. We start from the Perceptron and backpropagation, then build up through the modern stack. If you can write production code in any language, you can take this.

Join the community

Get your Discord invite.

Drop your email and we’ll send you straight into the Hubbleflow Discord: dedicated channels, cohort updates, and a growing knowledge base for engineers going deep on AI.

We’ll only email about cohort openings. Unsubscribe any time.