Three anonymised case studies from recent engagements. Metrics are verified and timelines are real. Client names are withheld by request.
Each loan application touched 6 different systems: a WhatsApp onboarding bot, a document portal, CIBIL API, an internal risk model, a human underwriter queue, and a disbursal trigger. None were connected. Ops staff spent 4+ hours per application copying data between systems, chasing missing documents, and manually escalating edge cases.
At 150 applications/day and ₹180/hour blended ops cost, manual processing alone cost about ₹1.35L per day (which works out to roughly 5 staff-hours per application), or about ₹3.6Cr annually.
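The cost figures above can be reproduced with a short back-of-envelope calculation. This is a sketch under stated assumptions: the hours per application and the number of processing days per year are not given in the case study; they are back-solved here so that the quoted daily and annual totals come out.

```python
# Back-of-envelope ops cost for manual loan processing.
APPLICATIONS_PER_DAY = 150          # from the case study
BLENDED_COST_PER_HOUR_INR = 180     # from the case study
HOURS_PER_APPLICATION = 5           # assumption: the text says "4+ hours"; 5 reproduces the daily figure
PROCESSING_DAYS_PER_YEAR = 267      # assumption: back-solved from the annual figure

LAKH = 100_000
CRORE = 10_000_000


def daily_cost_inr() -> int:
    """Staff cost of one day of manual processing, in rupees."""
    return APPLICATIONS_PER_DAY * HOURS_PER_APPLICATION * BLENDED_COST_PER_HOUR_INR


def annual_cost_inr() -> int:
    """Daily cost scaled by the assumed number of processing days."""
    return daily_cost_inr() * PROCESSING_DAYS_PER_YEAR


if __name__ == "__main__":
    print(f"Daily:  ₹{daily_cost_inr() / LAKH:.2f}L")
    print(f"Annual: ₹{annual_cost_inr() / CRORE:.2f}Cr")
```

Changing either assumed constant shows how sensitive the annual total is to the hours actually spent per application.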
We built a 7-agent LangGraph pipeline that orchestrates the entire loan journey autonomously. A document intake agent accepts uploads via WhatsApp and email, runs OCR and classification, and flags missing items. A KYC agent cross-checks PAN, Aadhaar, and selfie verification. A credit agent pulls CIBIL scores and formats them for the risk model. A risk agent runs the client's scoring model and routes each decision to the right bucket.
The human underwriter now only sees applications the system has flagged as requiring judgment — roughly 12% of volume. Everything else is fully automated end-to-end.
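The routing idea can be sketched in plain Python. This is not the production pipeline and it does not use LangGraph: ordinary functions stand in for graph nodes, only the four agents named above appear, and the document checklist, score thresholds, and field names are all hypothetical. The real risk agent runs the client's own scoring model rather than a CIBIL cutoff.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

REQUIRED_DOCS = {"pan", "aadhaar", "selfie"}  # hypothetical checklist


@dataclass
class Application:
    app_id: str
    documents: Set[str]
    cibil_score: Optional[int] = None
    decision: Optional[str] = None
    notes: List[str] = field(default_factory=list)


def intake_agent(app: Application) -> Application:
    # Flag missing items and pause the journey until they arrive.
    missing = REQUIRED_DOCS - app.documents
    if missing:
        app.decision = "await_documents"
        app.notes.append("missing: " + ", ".join(sorted(missing)))
    return app


def kyc_agent(app: Application, kyc_ok: Callable[[Application], bool]) -> Application:
    # kyc_ok is a stand-in for the real PAN / Aadhaar / selfie cross-check.
    if not kyc_ok(app):
        app.decision = "human_review"
        app.notes.append("KYC mismatch")
    return app


def credit_agent(app: Application, fetch_score: Callable[[Application], int]) -> Application:
    # fetch_score is a stand-in for the CIBIL API call.
    app.cibil_score = fetch_score(app)
    return app


def risk_agent(app: Application, approve_at: int = 750, decline_below: int = 600) -> Application:
    # Hypothetical thresholds; borderline scores go to the underwriter queue.
    if app.cibil_score >= approve_at:
        app.decision = "auto_approve"
    elif app.cibil_score < decline_below:
        app.decision = "auto_decline"
    else:
        app.decision = "human_review"
    return app


def run_pipeline(app, kyc_ok, fetch_score) -> Application:
    steps = [
        intake_agent,
        lambda a: kyc_agent(a, kyc_ok),
        lambda a: credit_agent(a, fetch_score),
        risk_agent,
    ]
    for step in steps:
        app = step(app)
        if app.decision is not None:  # stop as soon as any agent decides
            break
    return app
```

For example, `run_pipeline(Application("A1", {"pan", "aadhaar", "selfie"}), lambda a: True, lambda a: 680)` ends with `decision == "human_review"`, which is the bucket the underwriter sees.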
"We expected to spend 6 months on this. It was live in 8 weeks and handling 88% of applications fully automatically. The underwriters actually prefer it — they only see the interesting cases now."
— Head of Operations, Series A lending startup (name withheld)
The hospital group had a single WhatsApp Business number per location, managed by one receptionist. Messages came in Hindi, Hinglish, and English, as voice notes, text, and photos of prescriptions. Staff had no way to track which queries were resolved. After 6pm, patients received no response at all.
The ops manager estimated that 3 receptionist-hours per location per day (15 hours daily across 5 locations) went to routine questions an AI agent could answer.
We built a WhatsApp AI agent connected to the hospital's existing appointment system (a custom Django backend). The agent handles appointment booking and rescheduling, doctor availability queries, fee enquiries, test preparation instructions (pulled from a PDF knowledge base), directions and parking, and insurance panel queries.
Voice notes are transcribed using Whisper. Hindi and Hinglish messages are handled natively: we tuned the response prompts against real message samples provided by the hospital. Queries the agent can't answer are escalated to a human queue with full context and a suggested response pre-filled.
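The escalation path can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: keyword matching stands in for the language model, the intents and placeholder answers are invented, and `Escalation` is a hypothetical record type rather than the hospital's actual queue schema. Transcription is assumed to have happened already, so the handler only sees text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical intents; a production agent would classify with an LLM instead.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "fees": ["fee", "charges"],
    "directions": ["parking", "directions", "location"],
}

# Placeholder answers; real ones would come from the hospital's systems.
ANSWERS: Dict[str, str] = {
    "fees": "<consultation fee answer>",
    "directions": "<directions and parking answer>",
}


@dataclass
class Escalation:
    patient_id: str
    transcript: List[str]   # full conversation so staff have context
    suggested_reply: str    # pre-filled draft that staff can edit


def detect_intent(text: str) -> Optional[str]:
    lowered = text.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return intent
    return None


def handle_message(
    patient_id: str,
    text: str,
    history: List[str],
    human_queue: List[Escalation],
) -> Optional[str]:
    """Return an automatic reply, or None after queueing the query for a human."""
    transcript = history + [text]
    intent = detect_intent(text)
    if intent is not None:
        return ANSWERS[intent]
    human_queue.append(
        Escalation(
            patient_id=patient_id,
            transcript=transcript,
            suggested_reply="Thanks for your message. A team member will reply shortly.",
        )
    )
    return None
```

The point of the design is that a human who picks up an escalated query sees the whole conversation and a draft reply, instead of starting from a bare message.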
"Patients are now getting instant replies at 11pm. Bookings went up and our reception team is actually less stressed — they deal with complex cases, not 'what time does OPD open'."
— Hospital Operations Director (name withheld by client request)
The client's existing approach used rule-based regex classifiers written in 2019, which reached 67% accuracy on modern contracts as contract structure had become more varied. They'd evaluated GPT-4 and Claude but couldn't use either due to client data policies. They needed a model that could run on a single A100 server in their Frankfurt data centre.
Their CTO had spoken with 4 ML consultancies. Two said it wasn't possible without cloud infrastructure. One quoted €180K and 9 months. We quoted $22K and 6 weeks.
We fine-tuned Gemma 3 4B using QLoRA on their labelled dataset. Prototyping ran on rented A100s from Lambda Labs using a 10% sample. The full training run then executed on the client's own server using our scripts, so the full dataset stayed on their premises.
The final model handles 12 contract types, extracts 34 standard clause types, and flags non-standard language for human review. It runs in under 2 seconds per contract on their A100 — processing their entire historical archive in under 14 hours. We shipped an inference API wrapper, a monitoring dashboard, and full retraining documentation so their developers can fine-tune on new contract types without external help.
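The two throughput figures above can be related with simple arithmetic. The sketch below assumes contracts are processed one at a time at the 2-second upper bound; the archive size itself is not stated in the case study, so the number printed is only the ceiling those two figures imply.

```python
def archive_hours(num_contracts: int, seconds_per_contract: float = 2.0, workers: int = 1) -> float:
    """Wall-clock hours to process an archive, assuming sequential inference per worker."""
    return num_contracts * seconds_per_contract / workers / 3600


def max_contracts(hours_budget: float = 14.0, seconds_per_contract: float = 2.0) -> int:
    """Largest archive that fits in the time budget at the given per-contract latency."""
    return int(hours_budget * 3600 // seconds_per_contract)


if __name__ == "__main__":
    print(max_contracts())        # 25200
    print(archive_hours(25_200))  # 14.0
```

If the real per-contract latency is lower than 2 seconds, or requests are batched, the same 14-hour window covers a proportionally larger archive.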
"Every other agency told us we needed the cloud. Ravi's team understood the sovereignty constraint from day one and built around it — not against it. The model runs on our servers, we own it completely."
— CTO, European LegalTech company (name withheld)
The first call is free. Within 30 minutes we'll tell you whether we can help and what it would take.