Shivang Singh.
I build and scale GenAI systems in production — where latency, token limits, and failure modes matter as much as model quality.
Bodhi Atomize
Production multimodal GenAI platform decomposing 10,000+ marketing assets into 50+ structured signals per asset for Eli Lilly. Multi-stage LLM pipelines with token budgeting, backpressure, and KEDA-autoscaled microservices.
Dossier: 8-agent job-search SaaS · 79 companies · ~$0.04/run · M4 in progress
Obsessed with structured outputs, LLM evaluation, and production reliability under burst traffic.
I design and operate LLM pipelines that handle real traffic. At Publicis Sapient I lead Bodhi Atomize, a multimodal GenAI platform that turns images, videos, and GIFs into structured signals for enterprise clients like Eli Lilly. Previously, I shipped object detection and defect detection systems that improved accuracy and inference speed at scale.
My work sits at the intersection of GenAI systems, computer vision, and production ML engineering — where latency, token limits, retries, backpressure, and failure modes matter as much as model quality.
- Architected Bodhi Atomize — production multimodal GenAI platform cutting marketing asset analysis from hours to ~2 min per asset (95% reduction) across 10,000+ assets for Eli Lilly. Outputs 50+ structured JSON signals per asset.
- Engineered multi-stage LLM inference pipelines with Gemini 2.5 Pro and Pydantic-validated structured outputs. Implemented token budgeting, exponential-backoff retry, and backpressure control to sustain production throughput under rate limits.
- Integrated YOLO and PaddleOCR into LLM workflows, extracting 50+ typed visual components (text, characters, emotions, branding) per asset. Established LLM evaluation with DeepEval (LLM-as-judge, G-Eval).
- Built FastAPI microservices with Redis (caching + task queuing) and Celery. Deployed on Kubernetes with KEDA autoscaling to sustain 1,000+ concurrent requests under burst traffic with low latency.
- Integrated RF-DETR into production pipelines — 1.8× faster inference and +7% mAP50 improvement over YOLOv8 baseline on industrial defect detection.
- Curated and preprocessed 30,000+ industrial images through targeted augmentation and annotation QA pipelines, lifting defect detection accuracy by 10%.
- Led the supervised modelling team predicting urban farming zones in Milan using geospatial data.
- Engineered an XGBoost model achieving 93.68% accuracy. Conducted EDA on 106,000 rows with GeoPandas.
- Implemented real-time predictions, optimising data handling and model efficiency.
- Led the Computer Vision domain for the campus AI/ML club. Mentored juniors, ran workshops, organized hackathons.
- Co-led campus tech club. Organized events, hosted talks, fostered project-driven learning.
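The retry, backoff, and token-budgeting pattern mentioned in the pipeline bullets above can be sketched roughly as follows. This is a minimal illustration, not the Bodhi Atomize implementation: `RateLimitError`, `call_with_retry`, `backoff_delays`, and `TokenBudget` are hypothetical names, and the base delay, cap, and jitter strategy are assumptions.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def backoff_delays(max_retries, base=0.5, cap=8.0):
    """Un-jittered delay (seconds) before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_retry(fn, max_retries=4, base=0.5, cap=8.0, sleep=time.sleep):
    """Call fn(); on RateLimitError wait with full jitter, then try again."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            # Full jitter: sleep a random amount up to the exponential delay.
            sleep(random.uniform(0, delays[attempt]))


class TokenBudget:
    """Tracks a fixed token allowance per job (illustrative only)."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def try_spend(self, tokens):
        """Reserve tokens if they fit; return False so the caller can defer or trim."""
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True
```

A caller would typically check `budget.try_spend(estimated_tokens)` before wrapping the model call in `call_with_retry`, deferring the work when the budget is exhausted rather than sending a request that is likely to be rejected.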
Dossier
Quality-first agentic job-search SaaS
8-agent autonomous pipeline (Persona Builder, Job Discovery, Watchlist, Company Intel, Gap Analysis, Market Intel, Resume Agent, Referral Finder) that finds, scores, researches, and surfaces the roles most worth your time. Scoring is profile-driven across 79 hand-picked companies. A pre-LLM rule filter drops ~60% of jobs at zero cost. Claude generates ATS-optimised LaTeX resumes via 3-pass self-evaluation (Sonnet tailor → Haiku critic → Sonnet revise). M2+ wraps the CLI in a Next.js 16 + FastAPI + Clerk multi-user SaaS with credits, SSE progress, and an async worker.
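A pre-LLM rule filter of the kind described above might look something like this. The rules themselves are not listed on this page, so the watchlist, blocked title words, and allowed locations below are purely hypothetical, as are the `Job`, `passes_rules`, and `prefilter` names.

```python
import re
from dataclasses import dataclass


@dataclass
class Job:
    title: str
    company: str
    location: str


# Hypothetical rules; Dossier's actual criteria are not described in the text.
WATCHLIST = {"Acme AI", "Globex"}
BLOCKED_TITLE_WORDS = {"intern", "sales", "recruiter"}
ALLOWED_LOCATIONS = {"remote", "bengaluru"}


def passes_rules(job):
    """Cheap deterministic checks that run before any model call."""
    title_words = set(re.findall(r"[a-z]+", job.title.lower()))
    if job.company not in WATCHLIST:
        return False
    if title_words & BLOCKED_TITLE_WORDS:
        return False
    return job.location.lower() in ALLOWED_LOCATIONS


def prefilter(jobs):
    """Split jobs into (kept, dropped) without spending any LLM tokens."""
    kept, dropped = [], []
    for job in jobs:
        (kept if passes_rules(job) else dropped).append(job)
    return kept, dropped
```

Only the `kept` list would go on to LLM scoring, which is how a filter like this removes cost rather than just reordering it.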
FedFV-CV
Federated Deep Learning for Biometric Auth
Federated deep learning framework for finger-vein biometric authentication using MobileNetV2. Engineered custom FedWPR aggregation on 122,600 images across 5 clients, outperforming FedAvg benchmarks. B.Tech Thesis, IIIT SriCity.
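For context, the FedAvg baseline that FedWPR is compared against averages client parameters weighted by local dataset size. The sketch below shows that baseline only, on flat Python lists; it does not reproduce the custom FedWPR aggregation, whose details are not given here, and the `fedavg` function name is illustrative.

```python
def fedavg(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two toy clients: the second holds three times as much data,
# so its parameters pull the average three times as hard.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # [2.5, 3.5]
```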
slackAgent
AI-Powered Slack Bot with RAG
Scalable FastAPI backend with LlamaIndex + ChromaDB semantic search over 20+ documents. Cut query response time by 40% and served 50+ daily queries via Slack API with end-to-end automation through n8n.
RAG-QA on AWS
Retrieval-Augmented QA, fully CI/CD
Retrieval-augmented QA system using LangChain, FAISS, and AWS Bedrock (Llama 3.1 70B). Deployed to AWS ECR + App Runner via Docker with full CI/CD through GitHub Actions.
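The retrieval step in a system like this boils down to ranking stored document vectors by similarity to the query vector and passing the best passages to the model. The sketch below uses plain Python and toy two-dimensional vectors in place of FAISS and real embeddings; `cosine`, `top_k`, and `build_prompt` are illustrative names, not LangChain or FAISS APIs.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs, key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble a grounded prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real deployment a vector index replaces the linear scan in `top_k`, but the contract stays the same: query vector in, ranked passage ids out.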
LLM & GenAI: 10 items
Computer Vision: 5 items
MLOps & Backend: 7 items
Cloud & Infra: 8 items
Programming & ML: 6 items
Skip the scrolling. Ask my AI anything about my production GenAI work, side projects, hiring fit, or technical decisions. Streams answers grounded only on my real experience.
Latest from the blog.
Building Dossier in Public: The M4 Milestone
Four milestones in, Dossier is no longer a side project — it's an agentic job-search system with real users. Here's what shipped, what broke, and why I'm building it on LinkedIn.
Dossier: An 8-Agent Job-Search Pipeline That Survives Production at ~$0.04 per Run
Why I stopped building a job board and started building a quality-first agentic pipeline. The architecture, the cost discipline, and the SaaS layer I wrapped around it.
What I Learned Shipping a Multimodal GenAI Platform to Production
Bodhi Atomize processes 10,000+ marketing assets for Eli Lilly. Here's what actually broke, what scaled, and what I'd build differently next time.
Open to conversations around GenAI systems, LLM infrastructure, ML engineering, and production AI challenges. Drop a line — I reply fast.