Role: Full Stack Data Engineering (DE) Expert
Experience: 8+ years
Location: Remote
Shift Timing: UK Shift (2:30 PM to 11:30 PM IST)
Company Overview
At Codvo, software and people transformations go hand-in-hand. We are a global empathy-led technology services company. Product innovation and mature software engineering are part of our core DNA. Respect, Fairness, Growth, Agility, and Inclusiveness are the core values that we aspire to live by each day. We continue to expand our digital strategy, design, architecture, and product management capabilities to offer expertise, outside-the-box thinking, and measurable results.
Job Description
• Design, build, and maintain Databricks data pipelines (ETL/ELT) for ingestion, transformation, and orchestration using Spark/Delta Lake/Databricks Workflows.
• Operationalize machine learning models by building inference pipelines that invoke models authored by data scientists (batch or real-time), ensuring consistency between training and inference environments.
• Ensure data reliability, quality, and observability through robust validation, monitoring, alerting, and automated recovery mechanisms.
• Collaborate closely with data scientists to productionize models, manage model deployment lifecycles, and optimize inference performance and cost.
• Implement best-practice DevOps/MLOps processes such as CI/CD for pipelines, model versioning, environment promotion, and infrastructure-as-code.
• Optimize performance and cost across compute clusters, jobs, and storage layers.
• Implement and manage the enterprise data catalog, including schema design, table ownership, lineage, governance, and documentation using Unity Catalog.
• Hands-on experience with Databricks infrastructure.
• Experience building BI dashboards and visualizations.
• Experience with coding agents and related best practices (e.g., spec-driven development).
Skills Required
Must Have:
• Databricks platform experience
• Python development for data processing and ETL pipelines
• Unity Catalog knowledge
• AWS data services (S3, IAM, VPC, potentially Glue/Lambda)
• Data lake/lakehouse architecture patterns
• Dashboard building experience
Nice to Have:
• RESTful API design and development (Flask, FastAPI, or similar)
• Authentication/authorization patterns (OAuth, API keys, IAM roles)
• Query optimization and performance tuning
• PySpark optimization experience
• ML/AI pipeline experience
• Databricks AI/BI