AI Engineer

ProNavigator

Software Engineering, Data Science

Bengaluru, Karnataka, India

Posted on Jun 30, 2026

Job Description

Key Responsibilities

Harness Engineering: Build and maintain the LLM harnesses that power each AI feature — agent loops, tool/function calling, context construction, memory, retries, and failure handling. Use frameworks like LangChain, LlamaIndex, or LangGraph where they fit; write custom orchestration where they don't.

Prompt Engineering: Design, iterate, and version prompts as first-class assets. Run structured prompt experiments, measure deltas with eval datasets, and keep prompt libraries clean and reviewable.

RAG Pipelines: Build and operate Retrieval-Augmented Generation pipelines that extract information from documents and parse it into structured knowledge bases, covering chunking, indexing, retrieval, reranking, prompt assembly, and response handling. Collaborate with the AI Architect to shape and refine the patterns.

Embeddings & Vector Indexes: Generate embeddings and manage vector indexes (e.g., OpenSearch, Pinecone, pgvector). Tune indexes for retrieval quality and cost.

Data Parsing & Curation: Build extraction and parsing pipelines for documents, structured records, and customer datasets so they are clean, labeled, and ready for downstream AI work.

Accuracy Validation: Write and operate accuracy validation scripts. Maintain evaluation datasets and report quality metrics to the Strike Team and the AI Architect.

Light ML Work: Apply traditional ML where appropriate — classification, clustering, lightweight fine-tuning — to complement LLM-based components.

Analysis: Do the data analysis that informs AI design decisions — sample inspection, error analysis, prompt iteration.

KPIs & Success Metrics

AI feature accuracy and quality against eval datasets meet the project bar.

Prompt iteration velocity with measurable eval deltas on owned features.

RAG and retrieval quality (relevance, groundedness) for owned pipelines.

On-time delivery of AI-side Strike Team commitments.

Eval coverage: golden and regression sets maintained for owned features.

Key Skills & Experience

Education: Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related technical field.

Experience: 3+ years of AI or data science experience, with hands-on time in LLM-based application development.

Technical Proficiency:

Strong Python skills, including the standard data science stack (pandas, numpy, scikit-learn).

Hands-on experience building LLM harnesses — agent loops, tool/function calling, structured outputs — against APIs like Anthropic, OpenAI, or Bedrock.

Strong prompt engineering practice: structured iteration, prompt versioning, and prompt evaluation against datasets.

Working experience with at least one orchestration framework (LangChain, LlamaIndex, LangGraph) and at least one vector database.

Comfortable with embeddings, similarity search, and basic retrieval evaluation.

Working knowledge of classical ML for analysis and lightweight modeling tasks.

Comfortable using AI coding assistants (e.g., Claude Code) for daily work.

Engineering Excellence: Able to write clean, tested code that ships to production — not just notebooks. Familiar with Git, code review, and basic CI/CD.

Analytical Mindset: Strong instinct for data analysis, error inspection, and iterative experimentation.

Preferred Skills & Experience

Agent Frameworks: Hands-on experience with agentic frameworks (LangGraph, Claude Agent SDK, OpenAI Agents SDK) or custom agent harnesses in production.

Evals: Experience building structured eval harnesses (golden sets, regression suites, LLM-as-judge patterns).

Cloud: Hands-on AWS experience (Bedrock, SageMaker, OpenSearch).

Fine-Tuning: Any experience with model fine-tuning or distillation.

Guidewire Knowledge: Familiarity with Guidewire products or the insurance domain is a plus.

Domain Analysis: Prior experience working on document-heavy, regulated, or insurance/finance datasets.