Company Overview

At Codvo, software and people transformations go hand-in-hand. We are a global empathy-led technology services company where product innovation and mature software engineering are part of our core DNA. Respect, Fairness, Growth, Agility, and Inclusiveness are the core values we aspire to live by every day.

We continue to expand our digital strategy, architecture, AI/ML, GenAI, and product engineering capabilities to deliver outside-the-box thinking and measurable business outcomes for our clients.

Role Overview

We are looking for a Senior / Lead Data Scientist with strong expertise in Machine Learning, Deep Learning, and production-grade ML systems, particularly for time-series data, forecasting, and predictive modeling. The role also calls for hands-on experience in Generative AI (LLMs), Retrieval-Augmented Generation (RAG), and Agentic AI systems.

This role requires someone who can design, build, optimize, and productionize ML and GenAI solutions end-to-end, work closely with data engineering teams, and take ownership of complex AI workflows. Prior experience leading small teams or mentoring junior data scientists is strongly preferred.

Key Responsibilities

Core ML & Predictive Analytics

Design, develop, and deploy production-grade ML and DL models with a focus on time-series data, forecasting, and predictive analytics.
Build and optimize end-to-end ML pipelines, from data preprocessing and feature engineering to model training, evaluation, deployment, and monitoring.
Apply advanced ML techniques including regression, tree-based models, ensemble methods, deep learning, and optimization algorithms.
Perform feature extraction and dimensionality reduction using techniques such as autoencoders for high-dimensional datasets.
Track experiments, model performance, and metrics using industry-standard tools and best practices.

GenAI, LLMs & Agentic AI

Design and implement LLM-powered applications, including Retrieval-Augmented Generation (RAG) systems for enterprise use cases such as analytics automation, knowledge assistants, and decision-support tools.
Build document ingestion, chunking, embedding, and retrieval pipelines for structured and unstructured data using vector databases.
Develop Agentic AI workflows that enable multi-step reasoning, tool usage, and autonomous task execution.
Integrate LLMs with traditional ML systems to enhance explainability, insight generation, and user interaction.
Implement guardrails and evaluation mechanisms to reduce hallucinations and ensure reliable, grounded LLM outputs.
Optimize LLM inference for latency, cost, and scalability in cloud and hybrid environments.

Required Skills – Technical

7+ years of hands-on experience in Data Science, Machine Learning, or Applied AI roles.
Strong foundation in statistical modeling and machine learning, including:
- Regression, boosted trees, random forests
- Time-series modeling and forecasting
- Optimization techniques (linear, nonlinear, stochastic)
Deep Learning expertise using frameworks such as:
- TensorFlow, Keras, PyTorch
- Experience with RNN, LSTM, GRU, and CNN architectures is a plus
Experience with NLP and unstructured data processing.
Hands-on experience with LLMs and GenAI, including:
- Retrieval-Augmented Generation (RAG)
- Vector databases (FAISS, Chroma, Pinecone, or similar)
- Prompt engineering and LLM evaluation
- Agentic AI frameworks (e.g., LangChain, LangGraph, or similar)
Strong programming skills in Python (R is a plus); familiarity with sh/bash scripting.
Experience working with SQL and NoSQL databases.
Experience building and consuming REST APIs and web services.
Exposure to Big Data tools (Spark, Hadoop, or similar) is a strong plus.
Cloud experience (AWS / GCP / Azure); exposure to GenAI platforms (AWS Bedrock, Azure OpenAI, Vertex AI) is a plus.

Data Science Lead (India) (Remote)
