Production discipline, not pilot theater.
Optivus is an India-based AI consultancy. We build production AI systems for enterprise clients across knowledge retrieval and RAG, agent workflows, finance and operations automation, and conversational AI. Production deployments include Janus (hiring AI), Veritas (knowledge-grounded content and SEO), and FlowFin (AP/AR automation in active pilot with Indian enterprises).
Our positioning is production discipline. We ship systems that survive past the pilot. Eval harnesses, prompt registries, observability, and audit infrastructure are not optional in our work.
Own the engagement, end to end.
You will own a client engagement from first scoping conversation to delivery handoff. That means scoping with the client's CTO or business lead, designing the system, architecting and shipping it, building the eval and observability infrastructure around it, and walking the client's team through how to operate it.
You will work directly with the founders. There is no engineering manager between you and the work, and there will not be one. Engagements span manufacturing, financial services, staffing, logistics, e-commerce, healthcare, and professional services, so you will be context-switching between domains on a quarterly cadence.
The shape of the work.
- Lead 2 to 3 active client engagements at any time, from discovery through handoff.
- Design system architecture, choose models, design retrieval and agent strategies, and own the technical tradeoffs.
- Build production systems with eval harnesses, prompt versioning, telemetry, and audit trails from day one - not bolted on after launch.
- Run scoping and discovery sessions with client CTOs, COOs, and VPs of operations.
- Translate ambiguous business asks into specs, milestones, and shippable systems.
- Mentor junior engineers on the team as we scale headcount through 2026.
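If "prompt versioning, telemetry, and audit trails from day one" sounds abstract, here is the shape of what we mean. This is a minimal illustrative sketch, not Optivus code; every name in it (`PromptRegistry`, `log_call`) is hypothetical.

```python
import hashlib
import json
import time


class PromptRegistry:
    """Toy in-memory prompt registry: every prompt revision is stored
    under a content hash, so any logged call can be traced back to the
    exact template text that produced it."""

    def __init__(self):
        self._versions = {}  # version id -> template text
        self._latest = {}    # prompt name -> version id

    def register(self, name: str, template: str) -> str:
        version = hashlib.sha256(template.encode()).hexdigest()[:12]
        self._versions[version] = template
        self._latest[name] = version
        return version

    def latest(self, name: str) -> tuple[str, str]:
        version = self._latest[name]
        return version, self._versions[version]


def log_call(registry: PromptRegistry, name: str, variables: dict) -> dict:
    """Render the latest version of a prompt and emit an audit record
    pinning the version id, inputs, and timestamp."""
    version, template = registry.latest(name)
    rendered = template.format(**variables)
    record = {
        "prompt": name,
        "version": version,
        "variables": variables,
        "rendered": rendered,
        "ts": time.time(),
    }
    # In production this goes to a telemetry sink; here we just print.
    print(json.dumps({k: record[k] for k in ("prompt", "version")}))
    return record


registry = PromptRegistry()
v1 = registry.register("summarize", "Summarize the following text: {text}")
record = log_call(registry, "summarize", {"text": "hello"})
```

The point is not this particular design; it is that version pinning and audit records exist before the first client demo, not after the first incident.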
What we're looking for.
- Production engineering experience.
- 5+ years shipping production software with strong Python proficiency. You write maintainable code, you understand systems design, and you have shipped backends that take real traffic.
- LLM systems in production.
- 2+ years building production LLM or agent systems that customers use. Not pilots that died. Not POCs. Not internal demos. Systems running in production with traffic, SLAs, and real incident postmortems in your history.
- RAG and retrieval, hands-on.
- You have built non-trivial RAG systems. You know why naive cosine similarity over chunks fails on real corpora. You have implemented hybrid retrieval, reranking, and query rewriting, and you have measured retrieval quality empirically rather than by vibes. Direct experience with at least one of pgvector, Pinecone, Weaviate, Qdrant, or equivalent.
- Agent systems, with judgment.
- You have shipped agent workflows in production and have opinions about when to reach for a framework (LangGraph, CrewAI, etc.) and when to write the orchestration yourself. You understand tool use, structured outputs, and the failure modes of long-horizon agent runs.
- Eval and observability.
- You have built eval harnesses for LLM systems beyond “we eyeballed 20 outputs.” You understand offline evals, online evals, golden datasets, and regression testing for prompts and models. Hands-on with LangSmith, Langfuse, Helicone, Braintrust, or telemetry you rolled yourself.
- Claude Code or Codex as primary dev environment.
- You use Claude Code or Codex natively. By that we mean you have rebuilt your engineering workflow around these tools, your tooling config (CLAUDE.md, custom slash commands, skills, prompt libraries, MCP servers) has been deliberately curated over months, and you can articulate which tasks you delegate to the model and which you do not. This is a hard requirement, not a preference.
- Cloud and infra.
- Production deployment experience on AWS or GCP. Comfortable with Docker, CI/CD, Postgres, and the operational realities of running systems other people depend on.
- Client-facing skill.
- You can run a client engagement without supervision, including scoping calls with senior stakeholders, technical discovery, architecture decisions, delivery management, and the handoff conversation.
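To make the retrieval bar concrete: we expect you to be able to sketch something like hybrid retrieval with rank fusion from memory. Below is a deliberately toy version; the term-overlap and bag-of-words scorers stand in for BM25 and an embedding model, and the corpus is invented for illustration.

```python
import math
from collections import Counter

# Toy corpus standing in for a real document store.
DOCS = {
    "d1": "invoice totals and payment terms for the vendor",
    "d2": "employee onboarding checklist and HR policies",
    "d3": "vendor payment schedule and invoice approval workflow",
}

def tokens(text):
    return text.lower().split()

def keyword_rank(query, docs):
    """Rank docs by raw term overlap (stand-in for BM25)."""
    q = set(tokens(query))
    scores = {d: len(q & set(tokens(t))) for d, t in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def vector_rank(query, docs):
    """Rank docs by cosine similarity over bag-of-words counts
    (stand-in for an embedding model)."""
    def cos(a, b):
        num = sum(a[t] * b[t] for t in a)
        den = (math.sqrt(sum(v * v for v in a.values()))
               * math.sqrt(sum(v * v for v in b.values())))
        return num / den if den else 0.0
    q = Counter(tokens(query))
    scores = {d: cos(q, Counter(tokens(t))) for d, t in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def rrf(rankings, k=60):
    """Reciprocal rank fusion: combine rankings without tuned weights."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            fused[doc] += 1.0 / (k + rank + 1)
    return [doc for doc, _ in fused.most_common()]

query = "vendor invoice payment"
results = rrf([keyword_rank(query, DOCS), vector_rank(query, DOCS)])
```

In an interview we would push on exactly the parts this sketch elides: chunking, reranking, query rewriting, and how you would measure whether the fused ranking actually beats either retriever alone.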
A leg up, not required.
- Integration experience with Indian enterprise systems (SAP, Oracle, Tally, IBM Maximo, Microsoft Dynamics, Zoho, Darwinbox).
- OCR and document AI work, ideally on invoice or financial document extraction.
- Multimodal AI experience: audio, video, image generation pipelines.
- Prior consulting, solutions engineering, or forward-deployed engineer background.
Save us both the time.
- People who talk about AI well but have not shipped it.
- Engineers using AI development tools as fancy autocomplete without rethinking how they architect systems.
- Anyone who needs an engineering manager to make decisions for them.
- Generalists who think LLM systems are just another API integration.
- Engineers who treat evals and observability as nice-to-haves to bolt on later.
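"Evals from day one" can be as small as this: a golden dataset and a pass-rate gate wired into CI. The sketch below is illustrative only; the model call is a stub, and in practice the metric would be task-specific rather than exact match.

```python
# Minimal golden-dataset regression check for an LLM-backed function.
# The model is stubbed; in practice this would call your serving stack.

GOLDEN = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def model(prompt: str) -> str:
    # Stub standing in for the real model call.
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "")

def exact_match(pred: str, expected: str) -> bool:
    return pred.strip().lower() == expected.strip().lower()

def run_eval(golden, fn, metric, threshold=1.0):
    """Score fn over the golden set; flag a regression when the pass
    rate drops below the threshold."""
    passes = sum(metric(fn(case["input"]), case["expected"]) for case in golden)
    rate = passes / len(golden)
    return {"pass_rate": rate, "ok": rate >= threshold}

report = run_eval(GOLDEN, model, exact_match)
```

A prompt or model change that drops `pass_rate` below threshold blocks the release. That loop, however simple, is the difference between a system and a demo.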
More than a job description.
- Direct working relationship with both founders, no middle layer.
- Real ownership of client accounts and the architectural decisions that come with them.
- Exposure to senior enterprise stakeholders (CTOs, COOs, VPs of operations) and the kind of high-stakes shipping that builds a career faster than another year inside a large engineering org.
- Budget for tools, infrastructure, and continued learning.
- Latitude to shape engineering practices as we scale the team.
Five steps. No surprises.
- Application review against the three written prompts below.
- 45-minute call with Advik (founder, CEO).
- Technical deep-dive with Udayan (founder, CTO): system design and a real engineering decision from your past work.
- Paid take-home or live working session, scoped to 4 to 6 hours.
- Final conversation, references, offer.