Program Curriculum
Agentic AI Upskilling Program — Applied Case Studies for Real-World Systems
Module 1 — Generative AI Foundations
Core Concepts
- What makes an AI application intelligent?
- Converting traditional systems into GenAI-powered architectures
- Prompting techniques, RAG, and context engineering for AI reliability
- Swift, practical immersion into how modern GenAI systems work
- Setting up your local machine
- LLM server, LLMs, VectorDB, Embeddings
- Ollama, Llama3, Qdrant, bge-large
PII Detection using LLMs
Build a privacy-aware intelligence layer
Module 2 — Core GenAI Components
Core Building Blocks
- Launching and managing an LLM server
- Model invocation patterns for real applications
- VectorDB operations that power intelligent search
- Semantic search with embeddings for meaning-based retrieval
- Building blocks behind scalable GenAI applications
FinTech: Credit Card Fraud Detection
Build a VectorDB-powered fraud prevention pipeline
Module 3 — Agentic AI
Autonomous AI Workflows
- Why Agentic AI matters
- When to use agents instead of simple LLM prompts
- Agentic integrations with PDFs, GitHub, Web scraping and Databases
- Multi-agent collaboration with Agno, CrewAI, and LangGraph
- Function calling for tool invocation
- Autonomous agents that reason and work with external tools
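To make the function-calling topic above concrete, here is a minimal sketch of the dispatch step: the model emits a structured tool call, and the application routes it to a registered function. The tool name `get_loan_status`, the JSON shape, and the hard-coded `model_output` string are all assumptions made for illustration; real frameworks such as Agno, CrewAI, or LangGraph define their own formats.

```python
import json


def get_loan_status(application_id: str) -> str:
    """Hypothetical tool: look up a loan application's status in a toy table."""
    statuses = {"A-100": "approved", "A-200": "pending"}
    return statuses.get(application_id, "unknown")


# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"get_loan_status": get_loan_status}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call and run the matching registered function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Hard-coded string standing in for what an LLM might return (assumed format).
model_output = '{"name": "get_loan_status", "arguments": {"application_id": "A-100"}}'
result = dispatch_tool_call(model_output)
print(result)
```

Rejecting unknown tool names up front keeps the model from invoking anything outside the registry.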
Loan Processing Workflow
Build a multi-stage automation with Agentic AI
Module 4 — Model Context Protocol
MCP for Real-World Integrations
- What is the Model Context Protocol?
- How MCP standardizes tool access for LLMs
- When to use MCP over custom integrations
- Designing MCP-powered workflows
- Best practices for secure & reliable tool execution
- Apply MCP to interact with external systems
UI Automation with Playwright MCP
Automate end-to-end data entry using prompts alone
Module 5 — Multimodal AI
Audio Intelligence
- Speech-to-text transcription for recorded and live audio
- Automatic Speech Recognition across multiple languages
- Whisper & Vosk server for high-accuracy recorded & live transcription
- Convert real conversations into structured insights
Image Intelligence
- Automatic image interpretation and scene understanding
- Text extraction from images for OCR-driven workflows
- Explaining actions, objects and relationships within an image
- Visual understanding that extracts meaning, text and context from images
Video Intelligence
- Video question answering for scene-level reasoning
- Video classification for content tagging and detection
- Video captioning for summarizing visual sequences
- Deep video understanding for analysis, automation and interactive systems
Multimodal Models
- Gemini-2.5-Pro | Gemini-Live-2.5-Flash-Native-Audio
- Qwen2.5-VL | Llava3.1
Customer Support Automation
Transcribe audio calls and identify the causes of customer attrition
Module 6 — AI Labs for Real-World Scenarios
Compliance: PII & GDPR Data-Leak Detection
- Build an LLM-powered layer that detects and redacts sensitive data across languages
- Identify PII, financial identifiers, and personal details in real-world customer messages
- Intercept and analyze both human- and LLM-generated content before delivery
- Auto-mask sensitive fields while preserving meaning and readability
- Ensure GDPR-aligned compliance without slowing communication workflows
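As a rough illustration of the auto-masking step above, the sketch below replaces sensitive fields with labeled placeholders so the message stays readable. It uses plain regular expressions as a stand-in for the LLM-powered detector this lab builds; the two patterns and the `[EMAIL]` / `[CARD]` labels are assumptions, and real PII detection needs far broader coverage.

```python
import re

# Illustrative patterns only; a stand-in for the LLM-based detector.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each detected field with a bracketed label such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


message = "Reach me at jane.doe@example.com or pay with 4111 1111 1111 1111."
print(mask_pii(message))
# Reach me at [EMAIL] or pay with [CARD].
```

Labeled placeholders, rather than blanked-out text, keep the sentence structure intact for downstream readers and models.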
FinTech: Credit Card Fraud Detection
- Build a real-time fraud-detection system to analyze incoming transactions for risk signals
- Ingest customer spending history from PDFs into a VectorDB for contextual intelligence
- Use embeddings and semantic search to spot deviations from normal behavior
- Flag suspicious activity with clear, explainable reasoning
- Create a more adaptive and intelligent flow than rigid rule-based systems
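The idea of spotting deviations with embeddings can be sketched as follows: compare a new transaction's vector with the customer's historical vectors and flag it when nothing similar exists. The three-dimensional vectors, the `is_suspicious` helper, and the 0.8 threshold are toy assumptions; in the lab, vectors would come from an embedding model and the lookup would run in a VectorDB.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_suspicious(history, candidate, threshold=0.8):
    """Flag the candidate when its best match in history falls below threshold.

    Returns (flag, best_similarity) so the decision can be explained.
    """
    best = max(cosine_similarity(h, candidate) for h in history)
    return best < threshold, best


# Toy vectors standing in for embeddings of past transactions (assumed data).
history = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
typical = [0.85, 0.15, 0.05]   # resembles past spending
unusual = [0.0, 0.1, 0.95]     # unlike anything in history

for name, vector in (("typical", typical), ("unusual", unusual)):
    flagged, score = is_suspicious(history, vector)
    print(f"{name}: flagged={flagged}, best similarity={score:.2f}")
```

Returning the best similarity alongside the flag gives a simple, explainable reason for each decision.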
DevTools: Automated Programmer + QA Engineer
- Build an agentic system that codes, tests, and delivers software end-to-end
- Convert functional specs into production-ready programs across multiple languages
- Generate end-to-end QA test suites that align semantically with the implementation
- Accelerate software delivery while improving reliability and coverage
- Automate major parts of the SDLC
End-to-End AI Lab Sessions
Apply all concepts to real-world compliance, fraud, and automation scenarios
Module 7 — AI Integrations for Real-World Scenarios
Ops: Loan Processing Automation
- Build a multi-stage loan processing workflow
- Automate task planning & execution using specialized agents
- Orchestrate multiple agents: Collector, Verifier, Approver, Disbursement
- Apply Agentic AI to deliver compliant & explainable end-to-end automation
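The four-stage flow named above (Collector, Verifier, Approver, Disbursement) can be sketched as a simple pipeline in which each stage updates the application and records a reason. Plain functions stand in for LLM-backed agents here, and the `LoanApplication` fields and the "amount at most 5x income" rule are hypothetical, chosen only to make the example runnable.

```python
from dataclasses import dataclass, field


@dataclass
class LoanApplication:
    applicant: str
    income: float
    amount: float
    status: str = "new"
    audit_log: list = field(default_factory=list)


def collector(app):
    app.status = "collected"
    app.audit_log.append(f"collector: received application from {app.applicant}")
    return app


def verifier(app):
    if app.income <= 0 or app.amount <= 0:
        app.status = "rejected"
        app.audit_log.append("verifier: income and amount must be positive")
    else:
        app.status = "verified"
        app.audit_log.append("verifier: figures look valid")
    return app


def approver(app):
    # Hypothetical policy: lend at most five times the stated income.
    if app.amount > 5 * app.income:
        app.status = "rejected"
        app.audit_log.append("approver: amount exceeds 5x income")
    else:
        app.status = "approved"
        app.audit_log.append("approver: within lending limit")
    return app


def disbursement(app):
    app.status = "disbursed"
    app.audit_log.append(f"disbursement: released {app.amount}")
    return app


def run_pipeline(app):
    """Run the stages in order, stopping as soon as one rejects."""
    for stage in (collector, verifier, approver, disbursement):
        app = stage(app)
        if app.status == "rejected":
            break
    return app


result = run_pipeline(LoanApplication("Asha", income=50000, amount=200000))
print(result.status)
print(result.audit_log)
```

The audit log is what makes the outcome explainable: every stage leaves a short reason for its decision.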
External Integrations with MCP
- Automate UI workflows with Playwright MCP for browser interactions
- Perform NoSQL operations with MongoDB MCP
- Accomplish transactional tasks with PostgreSQL MCP
- Generate QA regression suites using GitHub MCP
- Extract structured insights from documents using PDFKnowledgeBase MCP
- Intelligent web-scraping with CrawlAI MCP
Full-Stack Agentic Integrations
Connect AI agents to browsers, databases, and external APIs via MCP
Module 8 — Multimodal AI for Real-World Scenarios
Audio Processing
- Transcribe customer–agent conversations with high accuracy
- Extract intent, sentiment, key notes, and action items from raw speech
- Build pipelines that convert audio into searchable, structured insights
Image Processing
- Automate invoice extraction using multimodal models such as LLaVA
- Interpret medical images using multimodal reasoning with Gemini
- Extract identity details from Aadhaar/SSN images and verify them against DB records
Video Processing
- Summarize customer-service videos with key actions and insights
- Query videos directly and receive precise, context-aware answers
- Detect events, understand scenes, and retrieve relevant moments
- Build video-aware AI systems for compliance & operational intelligence
Multimodal Production Pipelines
Build end-to-end audio, image, and video intelligence systems
